- 1. Installations, Dependencies, Build & Deployment
- 2. Cloud Infrastructure
- 3. Code
- 4. API Tests
- 5. API Documentation
- 6. Miscellaneous Notes
The Romaji2Kana API project only requires the JavaScript runtime Node.js 18 (LTS) to be installed. On my local computer I use Node v18.12.1, and on AWS the runtime Node.js 18.x is selected.
For testing the API, the Postman API Platform is used, which can be installed (recommended) or used in the web version.
The dependencies are managed via the npm package manager, which comes with Node.js. They can be installed by running `npm install` and listed via `npm list`.
- wanakana@5.1.0: a JavaScript library for detecting and transliterating Hiragana, Katakana and Romaji in all directions
On top of that, the local development version of the API (in this `main` branch) additionally has a server framework installed to serve the API.
See 3. Code Structure for a comparison between the local development version and the deployed cloud version (`main` vs. `release` branch).
- express@4.18.2: a fast, unopinionated, minimalist web framework for Node.js
Keep the dependencies up to date:
- Run `npm outdated`. This checks every installed dependency, compares the current version with the latest version in the npm registry, and prints the results in a table.
- Check the changelogs of the respective dependencies for breaking changes.
- Run `npm update`.
To update to a new major version (which often has breaking changes), you need to run `npm install <package>@latest` manually. See How to Update NPM Dependencies by freecodecamp.org.
Execute the following commands in the working directory of the project.
For the local development version in the `main` branch:
- Install all the dependencies with `npm install`.
- Serve the app with `node .` (takes the default entry point `index.js`) or `node index.js`.
- Test it by running requests against it, e.g. http://localhost:3000/v1/is/japanese?q=わたし should return `{"result":true}`.
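A request like the one above can also be sent from a script. The following is a minimal sketch, assuming the local server runs on port 3000 as described; the helper name `buildRequestUrl` is hypothetical and not part of the project. It mainly shows that the Japanese text has to be percent-encoded in the query string, which `URLSearchParams` does automatically.

```javascript
// Hypothetical helper (not part of the project): builds the request URL
// for an endpoint path and percent-encodes the `q` query parameter.
function buildRequestUrl(baseUrl, resourcePath, text) {
  const url = new URL(resourcePath, baseUrl);
  url.searchParams.set('q', text);
  return url.toString();
}

const url = buildRequestUrl('http://localhost:3000', '/v1/is/japanese', 'わたし');
console.log(url);

// With the local server running (Node.js 18 ships a global fetch):
// const response = await fetch(url);
// console.log(await response.json());
```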
For the cloud deployment version in the `release` branch:
- Install all the dependencies with `npm install`.
To deploy the production version, switch to the `release` branch.
Deploy the API to the cloud into production:
- automatically: run `npm run deploy` (to understand it, see the `scripts` section of `package.json`)
- manually: make a ZIP file of the whole working directory and upload it as code to the Lambda function `romaji2kana-api` via the AWS Management Console
Important: from here we can only change the behaviour of the Lambda function. If we'd like to change major things about the API, like adding or removing endpoints, we'd have to go to the API Gateway Management Console and edit them there! Mind the whole infrastructure, see 2. Cloud Infrastructure.
The following diagram shows the cloud infrastructure of the API in AWS.
The explanation follows one chapter at a time.
In Amazon Route 53 there are two records:
- an A record pointing the subdomain `api.romaji2kana.com` to the CloudFront distribution's domain name `d29lm7o1eb303u.cloudfront.net`
- a CNAME record `....api.romaji2kana.com` pointing to `....acm-validations.aws`: it proves to the Certificate Manager that we own the domain
This CloudFront distribution (with our API positioned behind it) is almost invisible to us as users. It has been created and is fully managed for us by AWS, because we chose "edge-optimized" as the API endpoint type instead of "regional".
We do get to see its endpoint, however, because we ourselves needed to create a DNS A record for `api.romaji2kana.com` that points to this distribution's endpoint, as described in 2.1. Amazon Route 53.
Look up the distribution domain name:
API Gateway > Custom domain names > `api.romaji2kana.com` > Configurations > API Gateway domain name: `d29lm7o1eb303u.cloudfront.net`
The API implements TLS (SSL) encryption by using a certificate from AWS Certificate Manager. It shares this certificate with the Romaji2Kana website (I added both the `romaji2kana.com` and `api.romaji2kana.com` domain names to this certificate).
The CloudFront distribution uses this certificate. I configured this in API Gateway because, as mentioned, the distribution is fully managed for us there.
Look up the certificate configuration:
API Gateway > Custom domain names > `api.romaji2kana.com` > Configurations > ACM certificate ARN: `arn:aws:acm:us-east-1:617879802663:certificate/34342583-9219-4ff8-b211-5ab260410f3e`
This API Gateway is of the REST API type, because we want to keep open the possibility of monetizing the API in the future, which requires features such as API keys, per-client rate limiting and usage plans. The HTTP API type is therefore not enough.
The API endpoint type is "Edge" (which deployed our API to CloudFront, serving it via the CDN!). This makes it significantly faster, just like the Romaji2Kana website, which is also delivered via CloudFront.
The only API stage is `prod`. The domain name `api.romaji2kana.com` is configured to map to this stage (see API Gateway > Custom domain names > `api.romaji2kana.com` > API mappings).
Visiting the API Gateway Management Console > `romaji2kana-api` > Resources, you see the full resource tree. Only the leaves of the tree are configured to be accessible (by having the methods GET and OPTIONS defined). They can be reached by sending a request with that method to the path shown under "Resource details". The following resource paths exist:
/v1/is/japanese
/v1/is/kana
/v1/is/hiragana
/v1/is/katakana
/v1/is/romaji
/v1/is/mixed
/v1/to/kana
/v1/to/hiragana
/v1/to/katakana
/v1/to/romaji
The `v1` resource exists so that new versions of the API can be developed and published easily while the previous ones stay supported under a consistent naming scheme.
You can reach each endpoint directly by the API Gateway Invoke URL or via the custom domain name:
CORS is enabled for all resources (otherwise websites and web apps using this API would get a CORS error saying that the website `example.com` may not send requests to another origin such as `api.romaji2kana.com`). I also created an OPTIONS method alongside every GET method (which also has CORS enabled itself): it answers preflight requests and tells them that CORS is enabled. We have to add this OPTIONS method on every resource, because it does NOT apply recursively down the tree.
See the explanation by AWS (source for the text above) and the developer guides on how to enable CORS and how to test CORS.
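To illustrate what the OPTIONS method does, here is a minimal sketch of a preflight response. The header names are the standard CORS response headers; the concrete values and the function name `buildPreflightResponse` are assumptions for illustration, since the real values are configured in API Gateway and are not shown in this README.

```javascript
// Illustrative only: the kind of response an OPTIONS (preflight) method returns.
// ASSUMPTION: the header values below are examples, not the API's actual configuration.
function buildPreflightResponse() {
  return {
    statusCode: 200,
    headers: {
      'Access-Control-Allow-Origin': '*',
      'Access-Control-Allow-Methods': 'GET,OPTIONS',
      'Access-Control-Allow-Headers': 'Content-Type',
    },
    body: '',
  };
}

console.log(buildPreflightResponse().headers);
```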
Now when a client makes a GET request to a resource path (one of the 10), it gets forwarded to the Lambda function that is configured for the method under "Integration request". The request is packed into an `event` object and passed to the Lambda's `handler(event)` function. It is not processed or modified by API Gateway, because the Lambda is integrated via proxy integration, which means API Gateway just acts as a proxy, forwarding requests to the Lambda as-is (included in the `event` object).
The Lambda function does all the API functionality in 27 lines of code (more on the code in 3. Code).
- It reads the `event` object to determine which resource has been requested (`event.resource` is actually identical to `event.path` here; both contain e.g. `/v1/is/hiragana`).
- It uses the `event.queryStringParameters.q` payload to read the `q=xyz` query string from the original request.
- Thus knowing what to do, it uses the right function from the wanakana library to do the operation.
- SPECIAL: It returns the result in a format that API Gateway understands (more in 3.3. `release` branch).
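The first three steps above can be sketched roughly as follows. This is not the project's actual 27 lines: the `operations` table, the `route` function and the stand-in check are hypothetical. In the real function, the table entries would be the corresponding wanakana functions (for example `wanakana.isHiragana`); a tiny stand-in is used here only so that the sketch runs without installing anything.

```javascript
// Stand-in for a wanakana function so the sketch is self-contained.
// ASSUMPTION: a simplified check covering only the basic Hiragana block.
const isHiraganaStandIn = (text) => /^[\u3041-\u3096]+$/.test(text);

// Hypothetical lookup table: resource path -> operation.
// In the real Lambda each entry would point to a wanakana function.
const operations = {
  '/v1/is/hiragana': isHiraganaStandIn,
};

// Pick the operation by resource, read `q`, run the operation.
function route(event) {
  const operation = operations[event.resource];
  const q = event.queryStringParameters.q;
  return { result: operation(q) };
}

// Trimmed-down event, as API Gateway's proxy integration would pass it.
const event = {
  resource: '/v1/is/hiragana',
  path: '/v1/is/hiragana',
  queryStringParameters: { q: 'わたし' },
};

console.log(JSON.stringify(route(event))); // {"result":true}
```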
You can test the Lambda function in the Management Console. I have configured a test event "isHiragana1".
Note that this Lambda function is not on edge locations (Lambda@Edge); it's in `eu-central-1` (Frankfurt). This does not negate the edge optimization of the preceding services, but instead provides the highest performance in our case:
- It's not that bad: once API requests have reached AWS at an edge location, they are in the AWS Global Network. Traffic over this AWS infrastructure is much faster than over the public Internet.
- For us it's even good: as long as there is no continuous traffic on most edge locations (which will presumably not be the case any time soon for this API), Lambda@Edge functions would not improve performance, because Lambda functions go to sleep after about 5-7 minutes of inactivity. The following "cold start" results in a request taking significantly longer than usual, because AWS Lambda first needs to download your code again and start a new execution environment.
The Lambda function writes its logs to the Amazon CloudWatch log group `/aws/lambda/romaji2kana-api`.
Currently only START and END of requests are logged, together with (1) a timestamp and (2) a requestId.
We could log anything though, with `console.log()` in the Lambda function.
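For instance, one JSON line per request could be logged. This is a sketch of one possible approach, not something the function currently does; the helper name `describeRequest` and the chosen fields are assumptions.

```javascript
// Hypothetical helper: summarizes a request as a single JSON log line.
function describeRequest(event) {
  const params = event.queryStringParameters || {};
  const hasQuery = typeof params.q === 'string';
  return JSON.stringify({
    resource: event.resource,
    hasQuery,
    queryLength: hasQuery ? params.q.length : 0,
  });
}

// Inside the Lambda handler, this output would end up in the CloudWatch log group.
console.log(describeRequest({
  resource: '/v1/to/kana',
  queryStringParameters: { q: 'kotoba' },
}));
// {"resource":"/v1/to/kana","hasQuery":true,"queryLength":6}
```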
- `main` branch: local development version of the API
- `release` branch: cloud deployed version of the API
Both branches share much in common.
- they both receive an HTTP request
  - the local development API does anyway, because it gets called directly on `localhost:3000`
  - the cloud deployed API gets the HTTP request forwarded from API Gateway, packed in the `event` object by which the Lambda function is triggered
- they both rely on a handler function that does the heavy lifting
- they both analyze which API resource has been requested
- they both just use a wanakana function to fulfill a request
- The `main` branch uses Express.js to serve the API. Hence it contains some Express code (`import express`, `express()`, `app.listen()`, `app.get('/')`, ...) to start the server and listen on the specified endpoints.
- It artificially creates an event like the one AWS would send to trigger the Lambda function (this makes it easier to copy the code to the `release` branch!).
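Creating such an artificial event could look roughly like this. It is a sketch under assumptions: the function name `buildEvent` is hypothetical, only fields mentioned in this README plus `httpMethod` are filled in, and `req` stands for an Express request object, of which only `req.path` and `req.query` are used.

```javascript
// Hypothetical helper: mimics the part of an API Gateway proxy event that the
// handler reads, built from an Express-style request object.
function buildEvent(req) {
  const hasParams = Object.keys(req.query).length > 0;
  return {
    resource: req.path,
    path: req.path,
    httpMethod: 'GET',
    // API Gateway sends null (not an empty object) when there are no query params.
    queryStringParameters: hasParams ? { ...req.query } : null,
  };
}

// Plain objects standing in for Express requests:
console.log(buildEvent({ path: '/v1/is/hiragana', query: { q: 'わたし' } }));
console.log(buildEvent({ path: '/v1/is/hiragana', query: {} }).queryStringParameters); // null
```

Inside an Express route, this would be called with the `req` argument of the `app.get(...)` callback.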
- The `release` branch is largely just the `main` branch stripped of everything but the processing of the request path and query parameter and the fulfillment of the request with the wanakana library.
- A major novelty, however, is the `formatResponse()` function, which wraps the body payload into a JSON object in the "Lambda function response format 1.0" (as in this official reference lambda). That is a special format required by API Gateway, so that it knows what to pass on to the client. Regular HTTP APIs now support format 2.0, which would infer what we explicitly define there, but we have picked the REST API type, which only supports 1.0.
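A minimal sketch of what such a function can look like is shown below. It is not the project's actual `formatResponse()`, and the header shown is an assumption. The fields used are ones that the proxy integration response format defines, and `body` has to be a string.

```javascript
// Sketch of a response in the "Lambda function response format 1.0".
// ASSUMPTION: the Content-Type header; the real function may set other headers.
function formatResponse(payload) {
  return {
    statusCode: 200,
    headers: { 'Content-Type': 'application/json' },
    isBase64Encoded: false,
    // API Gateway expects the body as a string, so the payload is serialized here.
    body: JSON.stringify(payload),
  };
}

console.log(formatResponse({ result: true }).body); // {"result":true}
```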
If explicit error handling is ever implemented, do it like this official reference lambda, keeping in mind that the header `x-amzn-ErrorType` is required.
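A hedged sketch of such an error response, for the case that the `q` parameter is missing, could look like this. The function name `formatError`, the status code, the error type string and the body shape are all assumptions; only the `x-amzn-ErrorType` header name comes from the text above.

```javascript
// Hypothetical error counterpart to formatResponse().
// ASSUMPTION: status code, error type value and body shape are illustrative.
function formatError(statusCode, errorType, message) {
  return {
    statusCode,
    headers: {
      'Content-Type': 'application/json',
      'x-amzn-ErrorType': errorType,
    },
    isBase64Encoded: false,
    body: JSON.stringify({ message }),
  };
}

const errorResponse = formatError(400, 'BadRequestException', 'Missing query parameter "q".');
console.log(errorResponse.statusCode, errorResponse.headers['x-amzn-ErrorType']);
```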
There is an extensive test suite for the API `api.romaji2kana.com` with 180+ tests in my Postman account.
The JSON Schema validation of the response body is done with the Ajv schema validator.
On the concern of whether you can pass very large texts as the query parameter: yes, you can.
Romaji | Hiragana | Katakana | WithKanji |
---|---|---|---|
yuubinkyoku | ゆうびんきょく | | |
aisukuri-mu | | アイスクリーム | |
sugoi! | すごい! | | |
Ohayou gozaimasu. | おはよう ございます。 | | |
Ogenki desuka? | おげんき ですか? | | お元気ですか? |
kotoba | ことば | | 言葉 |
Sentence for Romaji<->Kana: Hajimemashite. REANDA-san desu. Yoroshiku onegai shimasu.
Attention with `is/{japanese, kana, hiragana, katakana}`: Japanese doesn't use spaces at all!! (For the conversion functions, spaces in Japanese are supported and carry over to spaces in Romaji.)
Attention with `is/{kana, hiragana, katakana}`: no punctuation marks allowed!!
The API documentation is published on the Romaji2Kana website on the API page.
What `queryStringParameters` in the event object looks like when the HTTP request has ...
- ... no query params: `"queryStringParameters": null`
- ... a `q` param: `"queryStringParameters": { "q": "わたしはおとこです。" }`
- ... a `q` param that was left empty (`q=`): `"queryStringParameters": { "q": "" }`
- ... another param (`x=532`): `"queryStringParameters": { "x": "532" }`
- ... multiple params, including the `q` param (`q=watata&x=532`): `"queryStringParameters": { "q": "watata", "x": "532" }`
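Because `queryStringParameters` can be `null`, reading `event.queryStringParameters.q` directly throws a `TypeError` for a request without query parameters, unless that case is handled. A defensive read could be sketched as follows; the helper name `readQuery` is hypothetical, and treating an empty `q=` as a valid empty string is a design assumption.

```javascript
// Hypothetical helper: returns the `q` parameter, or undefined if it is absent.
// Optional chaining guards against queryStringParameters being null.
function readQuery(event) {
  return event.queryStringParameters?.q;
}

const cases = [
  { queryStringParameters: null },                      // no query params
  { queryStringParameters: { q: 'わたしはおとこです。' } }, // q param
  { queryStringParameters: { q: '' } },                 // q left empty
  { queryStringParameters: { x: '532' } },              // other param only
  { queryStringParameters: { q: 'watata', x: '532' } }, // q plus other param
];

for (const event of cases) {
  console.log(readQuery(event));
}
```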