Plant ID app (part 2): REST API
In part 1 of this blog post, we downloaded ~25,000 images of 100 plant species and trained a deep learning classification model. The 100 plant species are those included in the Danish stream plant index (DVPI). In part 2, we create a REST API with endpoints that can be accessed from a very simple landing page.
All code from parts 1 and 2 of this blog post can be found on GitHub.
The application
The purpose of the REST API is to serve our landing page, a simple ‘.html’ document, from which users can access the two essential services: identifying plant species with the model and determining the DVPI score via an external API. This results in three API endpoints:
- Landing page with a simple interface for uploading a ‘.csv’ file for determining the DVPI score or images for species identification
- Species identification using a deep learning model
- Determine the DVPI score for a stream plant community by calling an external (SOAP) API endpoint
The REST API has been created using FastAPI, which is modern and well-documented. It is contained in a single file where the endpoints are defined:
from typing import List
from fastapi import FastAPI, HTTPException, File, UploadFile, Request
from pydantic import BaseModel
import zeep
import pickle
from fastapi.staticfiles import StaticFiles
from fastapi.middleware.cors import CORSMiddleware
from PIL import Image
from io import BytesIO
from fastai.vision.all import *
import numpy as np
from fastapi.templating import Jinja2Templates
app = FastAPI()
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
###### DVPI endpoint ######
id_latin_dict = pickle.load(open("data/id_latin_dict.p", "rb"))
# Create SOAP client interface using zeep
wsdl = 'http://service.dvpi.au.dk/1.0.0/DCE_DVPI.svc?singleWsdl'
client = zeep.Client(wsdl=wsdl)
def parse_response(resp):
    resp_split = resp.split(" ")
    resp_dict = {i.split("=")[0]: float(i.split("=")[1].strip('\"')) for i in resp_split[1:4]}
    return resp_dict
class Item(BaseModel):
    art: str
    dkg: str
# Define request handler for requesting DVPI score from the external API
# from a text file with species names and cover (see example text file)
@app.post("/dvpi")
async def get_dvpi(items: List[Item]):
    spec_list = [i.art for i in items]
    dkg_list = [i.dkg for i in items]
    id_list = [id_latin_dict.get(i) for i in spec_list]
    if None in id_list:
        id_none = [i for i, j in zip(spec_list, id_list) if j is None]
        raise HTTPException(status_code=404, detail="Stancode not found for: " + ", ".join(id_none))
    body_items = ['<sc1064 ID="' + i + '" DKG="' + c + '" />' for i, c in zip(id_list, dkg_list)]
    request = "<DVPI_Input>" + "".join(body_items) + "</DVPI_Input>"
    response = client.service.DVPI(request)
    response_parsed = parse_response(response)
    return response_parsed
###### Plant ID endpoint ######
# Load plant species image classification model
model_weights = "data/model/effnet_b0.export"
model = load_learner(model_weights)
taxon_key_dict = pickle.load(open("data/taxon_key_dict.p", "rb"))
# Define request handler for image classification which returns the top-5 most likely species
@app.post("/predict")
async def predict_image(file: UploadFile = File(...)):
    request_content = await file.read()
    try:
        image = np.array(Image.open(BytesIO(request_content)))
    except Exception:
        raise HTTPException(status_code=422, detail="Unable to process file")
    _, _, probs = model.predict(image)
    # Get top-5 highest predictions
    _, idx = probs.topk(5)
    top_5_labels = model.dls.vocab[idx]
    # Create label with species names and probabilities
    label = ", ".join(["{} {}%".format(taxon_key_dict[l], int(probs[i]*100)) for l, i in zip(top_5_labels, idx)])
    return {"response": label}
###### Static files ######
app.mount("/static", StaticFiles(directory="static/"), name="static")
templates = Jinja2Templates(directory="templates")
# Define request handler for landing page
@app.get("/")
async def root(request: Request):
    return templates.TemplateResponse("index.html", {"request": request})
A brief walk-through:
- The ‘/dvpi’ endpoint converts the received JSON into the specific XML format expected by the external SOAP API, which is called using the zeep Python library. Finally, it parses the string containing the resulting DVPI score and returns it to the front-end
- The ‘/predict’ endpoint converts the received image bytes to a NumPy array and passes it through the model that was developed in part 1. The top-5 highest species scores are then returned to the front-end
- The ‘/’ (root) endpoint simply returns an ‘.html’ file which serves as the front-end.
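To illustrate the first step, here is a minimal sketch of how the ‘/dvpi’ handler assembles the SOAP payload from the posted JSON items. Note that the stancode mapping below is hypothetical for illustration; the real one is loaded from ‘data/id_latin_dict.p’:

```python
# Hypothetical stancode mapping; the real app loads id_latin_dict from a pickle file
id_latin_dict = {"Berula erecta": "1234", "Ranunculus repens": "5678"}

# Items as posted by the front-end (parsed from the '.csv' file)
items = [{"art": "Berula erecta", "dkg": "60"},
         {"art": "Ranunculus repens", "dkg": "35.5"}]

# Look up the stancode ID for each species and pair it with its cover
id_list = [id_latin_dict.get(i["art"]) for i in items]
dkg_list = [i["dkg"] for i in items]

# Wrap each species as an XML element and assemble the request body
body_items = ['<sc1064 ID="' + i + '" DKG="' + c + '" />' for i, c in zip(id_list, dkg_list)]
request = "<DVPI_Input>" + "".join(body_items) + "</DVPI_Input>"
print(request)
# → <DVPI_Input><sc1064 ID="1234" DKG="60" /><sc1064 ID="5678" DKG="35.5" /></DVPI_Input>
```

This string is then passed to `client.service.DVPI(request)`, and the response is reduced to a dictionary of scores by `parse_response`.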
Front-end
Creating the landing page makes the services exposed by the REST API consumable by everyone. It is a super simple HTML template (the text is in Danish):
<!DOCTYPE html>
<html>
<head>
<link rel="stylesheet" href="{{ url_for('static', path='style.css') }}">
<script src="{{ url_for('static', path='lib/papaparse.min.js') }}"></script>
<script src="{{ url_for('static', path='script_csv.js') }}"></script>
<script src="{{ url_for('static', path='script_image.js') }}"></script>
</head>
<body>
<h1>Dansk vandplante indeks (DVPI)</h1>
<div class="main-container">
<div class = "calc-container">
<h2>DVPI beregner</h2>
<p>Upload '.csv' tekstfil med artsnavn og dækningsgrad (0-100%) formateret som:</p>
<i>art,dkg
<br>
Trådalger,60
<br>
Ranunculus repens,35.5
<br>
Berula,4.5
<br>
</i>
<br>
<a href="{{ url_for('static', path='example.csv') }}" download="eksempel.csv">Hent eksempel på '.csv' fil</a>
<br>
<br>
<input type="file" id="fileUpload" accept=".csv" onchange="parse_csv(this)"/>
<br>
<br>
<div id="dvpi_result"></div>
</div>
<div class = "classif-container">
<h2>DVPI billed identifikation</h2>
<p>Upload billedfil (for eksempel .png, .jpeg, .jpg, .tif, etc.)</p>
<p>Identifikation af de 100 mest almindelige arter som indgår i DVPI</p>
<a href="{{ url_for('static', path='taxon_list.html') }}">Se artsliste</a>
<br>
<br>
<input type="file" id="imageUpload" onchange= "upload_image(this)" />
<br>
<br>
<div id="dvpi_art"></div>
</div>
</div>
</body>
</html>
The two endpoints are accessed using two basic JavaScript functions: one parses a ‘.csv’ file with plant species and cover using the PapaParse js library and performs the HTTP POST request:
function parse_csv() {
    var file = document.getElementById('fileUpload').files[0];
    Papa.parse(file, {
        header: true,
        complete: function(results) {
            var data = results.data;
            fetch("http://127.0.0.1:8000/dvpi", {
                method: "POST",
                headers: {'Content-Type': 'application/json', 'accept': 'application/json'},
                body: JSON.stringify(data)
            }).then(response => response.text())
              .then(data => document.getElementById('dvpi_result').textContent = `Resultat = ` + data);
        }
    });
}
And the other uses the uploaded image to perform the HTTP POST request:
function upload_image() {
    var file = document.getElementById('imageUpload').files[0];
    let data = new FormData();
    data.append('file', file);
    fetch('http://127.0.0.1:8000/predict', {
        method: 'POST',
        body: data
    }).then(response => response.json())
      .then(data => {
          console.log(data);
          document.getElementById('dvpi_art').textContent = `Resultat = ` + data.response;
      });
}
This is the essential functionality. Additionally, a link to an example ‘.csv’ file, a list of all species included in the plant identification model, and a tiny bit of CSS are added:
.calc-container {
    border-style: solid;
    border-width: 4px;
    border-radius: 4px;
}
.classif-container {
    border-style: solid;
    border-width: 4px;
    border-radius: 4px;
}
.main-container {
    display: grid;
    grid-template-columns: 1fr 1fr;
    grid-gap: 20px;
}
* {
    font-family: Helvetica;
}
h1 {
    text-align: center;
}
The final result is a web application that admittedly looks very basic but exposes useful services from the REST API.
Landing page with results returned from two services: one determining DVPI score from an input ‘.csv’ file and the other identifying plant species from images
Concluding remarks
Over the course of this blog post, we downloaded a large number of images from GBIF, trained a convolutional neural network, and wrapped it in a web application that can be served online. The basic front-end serves the purpose of illustration; the magic takes place behind the scenes. The repository contains the additional files necessary to reproduce the runtime environment and deploy the application, for example on Heroku.
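As a deployment sketch (assuming the FastAPI file above is saved as `main.py` and defines the `app` object; the actual filenames in the repository may differ), the application can be served locally with uvicorn, and a Heroku `Procfile` would contain a similar command:

```shell
# Serve the FastAPI app locally (assumes the file is named main.py and defines `app`)
uvicorn main:app --reload --port 8000

# Example Procfile line for Heroku, which injects the port via $PORT:
# web: uvicorn main:app --host=0.0.0.0 --port=${PORT}
```

With the server running, the landing page is available at http://127.0.0.1:8000/, which matches the URLs hard-coded in the JavaScript functions above.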