Using the API
The Swissdox@LiRI API lets you submit queries, check the status of submitted queries, and download the retrieved data.
Endpoint | Description |
---|---|
/query | Submits a query. Required parameters are name and query. The parameter test can be used to check whether the query is correct. Returns the id of the submitted query. |
/status | Returns a list of all submitted queries. |
/status/<query_id> | Returns the status of the query with the id query_id. |
/download/<filename> | Downloads the retrieved dataset. |
To use the API, you first need to create an API key in the Swissdox@LiRI web application (page Projects), separately for each project. To pass the API key to the server, you need to specify the X-API-Key and X-API-Secret headers.
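To keep the key and secret out of source files, the two headers can be built from environment variables. The following is a minimal sketch of that idea. The variable names SWISSDOX_API_KEY and SWISSDOX_API_SECRET and the helper build_headers are assumptions made for this example; they are not defined by Swissdox@LiRI.

```python
import os


def build_headers(env=os.environ):
    """Return the two authentication headers, read from environment variables.

    SWISSDOX_API_KEY and SWISSDOX_API_SECRET are hypothetical variable names
    chosen for this example.
    """
    try:
        return {
            "X-API-Key": env["SWISSDOX_API_KEY"],
            "X-API-Secret": env["SWISSDOX_API_SECRET"],
        }
    except KeyError as missing:
        # Fail early with a readable message instead of sending empty headers
        raise RuntimeError(f"Environment variable {missing} is not set") from None
```

The returned dictionary can be passed as the headers argument of a requests call.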
Submitting a query
The query is defined in YAML format. Additional arguments are the query name, a comment, an expiration date and a flag that specifies whether the query should be run or only checked for correct syntax. Below is a simple Python example that submits a query using the API.
- swissdox_submit_query.py
    import requests

    # Authentication headers; replace the placeholders with your own credentials
    headers = {
        "X-API-Key": "<your-api-key>",
        "X-API-Secret": "<your-api-secret>"
    }

    API_BASE_URL = "https://swissdox.linguistik.uzh.ch/api"
    API_URL_QUERY = f"{API_BASE_URL}/query"

    # Query definition in YAML
    yaml_example = """
        query:
            sources:
                - ZWA
                - ZWAS
            dates:
                - from: 2022-12-01
                  to: 2022-12-31
            languages:
                - de
                - fr
            content:
                AND:
                    - OR:
                        - COVID
                        - Corona
                    - NOT: China
                    - NOT: chin*
        result:
            format: TSV
            maxResults: 100
            columns:
                - id
                - pubtime
                - medium_code
                - medium_name
                - rubric
                - regional
                - doctype
                - doctype_description
                - language
                - char_count
                - dateline
                - head
                - subhead
                - content_id
                - content
        version: 1.2
    """

    # Form fields sent with the request; "test" asks for a check of the query
    data = {
        "query": yaml_example,
        "test": "1",
        "name": "Query name 1",
        "comment": "Query comment",
        "expirationDate": "2023-02-28"
    }

    r = requests.post(
        API_URL_QUERY,
        headers=headers,
        data=data
    )
    print(r.json())
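When several queries are submitted from one script, the form fields can be assembled by a small helper. The sketch below uses the field names from the example above. The function build_query_payload is hypothetical, and leaving the test field out when no check is wanted is an assumption about how the server treats a missing flag.

```python
def build_query_payload(query_yaml, name, comment=None, expiration_date=None, test=False):
    """Assemble the form fields for a POST request to /query.

    Field names follow the submission example. Omitting "test" when test is
    False is an assumption, not documented behaviour.
    """
    payload = {"query": query_yaml, "name": name}
    if comment is not None:
        payload["comment"] = comment
    if expiration_date is not None:
        payload["expirationDate"] = expiration_date  # e.g. "2023-02-28"
    if test:
        payload["test"] = "1"  # same value as in the submission example
    return payload
```

The result can be passed as the data argument of requests.post.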
Checking the status of submitted queries
It is possible to check the status of all submitted queries, as well as the status of a single query with a specific id. The following example shows how to list all submitted queries with their respective statuses:
- swissdox_status.py
    import requests

    headers = {
        "X-API-Key": "<your-api-key>",
        "X-API-Secret": "<your-api-secret>"
    }

    API_BASE_URL = "https://swissdox.linguistik.uzh.ch/api"
    API_URL_STATUS = f"{API_BASE_URL}/status"

    # List all submitted queries with their statuses
    r = requests.get(
        API_URL_STATUS,
        headers=headers
    )
    print(r.json())
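The status of a single query is available at /status/<query_id>. The following sketch covers both cases with one function. The helpers status_url and fetch_status and the timeout value are assumptions for illustration. The structure of the JSON response is not described here, so it is returned unchanged.

```python
import requests

API_BASE_URL = "https://swissdox.linguistik.uzh.ch/api"


def status_url(query_id=None):
    """Return the /status URL, optionally narrowed to a single query id."""
    if query_id is None:
        return f"{API_BASE_URL}/status"
    return f"{API_BASE_URL}/status/{query_id}"


def fetch_status(headers, query_id=None, timeout=30):
    """Request the status and return the decoded JSON body as-is."""
    response = requests.get(status_url(query_id), headers=headers, timeout=timeout)
    response.raise_for_status()  # raise on HTTP error codes before decoding JSON
    return response.json()
```

Calling fetch_status(headers) lists all queries, while fetch_status(headers, "<query_id>") asks for one query only.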
Download of the retrieved dataset
When you check the status of your queries (as shown in the example above), the response also contains a download URL for each completed query. Use this URL in an API request to download your dataset:
- swissdox_download.py
    import requests

    headers = {
        "X-API-Key": "<your-api-key>",
        "X-API-Secret": "<your-api-secret>"
    }

    API_BASE_URL = "https://swissdox.linguistik.uzh.ch/api"
    API_URL_DOWNLOAD = f"{API_BASE_URL}/download/6399ae50-0d80-4304-9d2a-d92fa3dc753c__2022_02_28T10_26_03.tsv.xz"

    r = requests.get(
        API_URL_DOWNLOAD,
        headers=headers
    )
    if r.status_code == 200:
        print("Size of file: %.2f KB" % (len(r.content) / 1024))
        # Write the compressed dataset to disk; the with block closes the file
        with open("./dataset.tsv.xz", "wb") as fp:
            fp.write(r.content)
    else:
        print(r.text)
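The file name in the example ends in .tsv.xz, which suggests tab-separated values compressed with xz. The sketch below reads such a file using only the Python standard library. It assumes UTF-8 encoding and a first line containing the column names; neither is confirmed here. The function read_dataset is a hypothetical helper.

```python
import csv
import lzma


def read_dataset(path, limit=None):
    """Read a .tsv.xz file and return its rows as a list of dicts.

    Assumes UTF-8 text with the column names in the first line.
    Pass limit to stop after that many rows.
    """
    rows = []
    # lzma.open in text mode decompresses on the fly; newline="" is what csv expects
    with lzma.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        for row in reader:
            if limit is not None and len(rows) >= limit:
                break
            rows.append(row)
    return rows
```

For example, read_dataset("./dataset.tsv.xz", limit=5) returns at most the first five rows. Very long text fields may require raising the limit with csv.field_size_limit.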