LiRI Wiki

Linguistic Research Infrastructure - University of Zurich


Using the API

The Swissdox@LiRI API allows you to submit a query, to check the status of submitted queries and to download the retrieved data.

Endpoint               Description
/query                 Submits a query. The parameters name and query are required; the parameter test can be used to check whether the query is correct. Returns the ID of the submitted query.
/status                Returns a list of all submitted queries.
/status/<query_id>     Returns the status of the query with the ID query_id.
/download/<filename>   Downloads the retrieved dataset.

To use the API, first create an API key in the Swissdox@LiRI web application (page Projects); each project needs its own key. Pass the credentials to the server in the X-API-Key and X-API-Secret request headers.
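To keep the key and secret out of source files, the two headers can be assembled from environment variables. The following is a minimal sketch: the helper name build_headers and the variable names SWISSDOX_API_KEY and SWISSDOX_API_SECRET are assumptions chosen for illustration and are not part of the Swissdox@LiRI API.

```python
import os


def build_headers(env=os.environ):
    """Return the two authentication headers named on this page.

    The environment variable names are hypothetical; any names will do.
    """
    try:
        return {
            "X-API-Key": env["SWISSDOX_API_KEY"],
            "X-API-Secret": env["SWISSDOX_API_SECRET"],
        }
    except KeyError as missing:
        # Fail early with a readable message instead of sending empty credentials.
        raise RuntimeError(
            f"Environment variable {missing.args[0]} is not set"
        ) from None
```

The returned dictionary can be passed as the headers argument of requests.get or requests.post in the examples on this page.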

Submitting a query

The query is defined in YAML format. Alongside the query itself, the request can carry a query name, a comment, an expiration date and a flag that specifies whether the query should be run or only checked for correct syntax. Below is a simple Python example that submits a query through the API.

swissdox_submit_query.py
import requests
 
headers = {
    "X-API-Key": "<your-api-key>",
    "X-API-Secret": "<your-api-secret>"
}
API_BASE_URL = "https://swissdox.linguistik.uzh.ch/api"
API_URL_QUERY = f"{API_BASE_URL}/query"
 
yaml_example = """
    query:
        sources:
            - ZWA
            - ZWAS
        dates:
            - from: 2022-12-01
              to: 2022-12-31
        languages:
            - de
            - fr
        content:
            AND:
                - OR:
                    - COVID
                    - Corona
                - NOT: China
                - NOT: chin*
    result:
        format: TSV
        maxResults: 100
        columns:
            - id
            - pubtime
            - medium_code
            - medium_name
            - rubric
            - regional
            - doctype
            - doctype_description
            - language
            - char_count
            - dateline
            - head
            - subhead
            - content_id
            - content
    version: 1.2
"""
 
data = {
    "query": yaml_example,
    "test": "1",
    "name": "Query name 1",
    "comment": "Query comment",
    "expirationDate": "2023-02-28"
}
 
r = requests.post(
    API_URL_QUERY,
    headers=headers,
    data=data
)
print(r.json())
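When several queries are submitted from one script, the form fields can be assembled by a small helper. This is a sketch under stated assumptions: build_payload is a hypothetical name, the field names are taken from the example above, and omitting the test field for a real run is a guess, since the example only shows the value "1" for a test submission.

```python
def build_payload(query_yaml, name, comment=None, expiration_date=None, test=False):
    """Assemble the form fields for a POST request to /query.

    Field names follow the example above. Leaving out "test" for a real
    run is an assumption, not documented behaviour.
    """
    payload = {"query": query_yaml, "name": name}
    if comment is not None:
        payload["comment"] = comment
    if expiration_date is not None:
        # Accepts a datetime.date or an already formatted "YYYY-MM-DD" string.
        payload["expirationDate"] = str(expiration_date)
    if test:
        payload["test"] = "1"
    return payload
```

The result can be passed as data=build_payload(yaml_example, "Query name 1", test=True) in the requests.post call above.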

Checking the status of submitted queries

You can check the status of all submitted queries at once, or the status of a single query identified by its ID. The following example lists all submitted queries with their respective statuses:

swissdox_status.py
import requests
 
headers = {
    "X-API-Key": "<your-api-key>",
    "X-API-Secret": "<your-api-secret>"
}
API_BASE_URL = "https://swissdox.linguistik.uzh.ch/api"
API_URL_STATUS = f"{API_BASE_URL}/status"
 
r = requests.get(
    API_URL_STATUS,
    headers=headers
)
print(r.json())
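The status of a single query is available under /status/<query_id>, which is useful for waiting until a dataset is ready. The sketch below builds that URL and polls until a caller-supplied check passes. The names status_url and wait_until are hypothetical, and because the layout of the status response is not described on this page, both the fetch function and the completion check are left to the caller.

```python
import time

API_BASE_URL = "https://swissdox.linguistik.uzh.ch/api"


def status_url(query_id=None):
    """Return the status URL for all queries, or for one query if an ID is given."""
    if query_id is None:
        return f"{API_BASE_URL}/status"
    return f"{API_BASE_URL}/status/{query_id}"


def wait_until(fetch, is_done, interval=30.0, max_attempts=20, sleep=time.sleep):
    """Call fetch() repeatedly until is_done(result) is true, then return result.

    fetch and is_done come from the caller because the fields of the
    status response are an unknown here.
    """
    for attempt in range(max_attempts):
        result = fetch()
        if is_done(result):
            return result
        if attempt < max_attempts - 1:
            sleep(interval)  # pause between checks, but not after the last one
    raise TimeoutError(f"Query not finished after {max_attempts} checks")
```

In practice, fetch could be lambda: requests.get(status_url(query_id), headers=headers).json(), and is_done should test whichever field of the actual response marks a query as completed; inspect a real response before relying on any field name.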

Downloading the retrieved dataset

When you check the status of your queries (as shown in the example above), the response also contains a download URL for each completed query. Use this URL in an API request to download your dataset:

swissdox_download.py
import requests
 
headers = {
    "X-API-Key": "<your-api-key>",
    "X-API-Secret": "<your-api-secret>"
}
API_BASE_URL = "https://swissdox.linguistik.uzh.ch/api"
API_URL_DOWNLOAD = f"{API_BASE_URL}/download/6399ae50-0d80-4304-9d2a-d92fa3dc753c__2022_02_28T10_26_03.tsv.xz"
 
r = requests.get(
    API_URL_DOWNLOAD,
    headers=headers
)
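if r.status_code == 200:
    print("Size of file: %.2f KB" % (len(r.content)/1024))
    # The context manager closes the file even if writing fails.
    with open("./dataset.tsv.xz", "wb") as fp:
        fp.write(r.content)
else:
    # On failure, print the error message returned by the server.
    print(r.text)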
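The downloaded file is an xz-compressed TSV file, which Python's standard library can read without unpacking it first. The following is a minimal sketch; read_tsv_xz is a hypothetical helper, and it assumes UTF-8 text, a header row with the column names, and the csv module's default quoting rules, none of which are confirmed on this page.

```python
import csv
import lzma


def read_tsv_xz(path, limit=None):
    """Yield the rows of an xz-compressed TSV file as dictionaries.

    Assumes UTF-8 text with a header row naming the columns.
    """
    # lzma.open in text mode decompresses on the fly while reading.
    with lzma.open(path, "rt", encoding="utf-8", newline="") as fh:
        reader = csv.DictReader(fh, delimiter="\t")
        for i, row in enumerate(reader):
            if limit is not None and i >= limit:
                break
            yield row
```

For example, for row in read_tsv_xz("./dataset.tsv.xz", limit=5): print(row) shows the first five records, keyed by the column names of the file's header row.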
langtech/swissdox/api.txt · Last modified: 2023/04/13 18:08 by Klaus Rothenhäusler
