AI Tool for querying natural language on tabular data.
AI Tool for querying natural language on tabular data.Built using QA models from transformers.
This work is described in the following paper:
TableQuery: Querying tabular data with natural language, by Abhijith Neil Abraham, Fariz Rahman and Damanpreet Kaur.
If you use TableQA, please cite the paper.
Here is a detailed blog to understand how this works.
A tabular data can be:
pip install tableqa
git clone https://github.com/abhijithneilabraham/tableQA
cd tableqa
python setup.py install
from tableqa.agent import Agent
agent=Agent(df) #input your dataframe
response=agent.query_db("Your question here")
print(response)
sql=agent.get_query("Your question here")
print(sql) #returns an sql query
{
"name": DATABASE NAME,
"keywords":[DATABASE KEYWORDS],
"columns":
[
{
"name": COLUMN 1 NAME,
"mapping":{
CATEGORY 1: [CATEGORY 1 KEYWORDS],
CATEGORY 2: [CATEGORY 2 KEYWORDS]
}
},
{
"name": COLUMN 2 NAME,
"keywords": [COLUMN 2 KEYWORDS]
},
{
"name": "COLUMN 3 NAME",
"keywords": [COLUMN 3 KEYWORDS],
"summable":"True"
}
]
}
summable
is included for Numeric Type columns whose values are already count representations. Eg. Death Count,Cases
etc. consists values which already represent a count.Example (with manual schema):
from tableqa.agent import Agent
agent=Agent(df,schema) #pass the dataframe and schema objects
response=agent.query_db("how many people died of stomach cancer in 2011")
print(response)
#Response =[(22,)]
from tableqa.agent import Agent
agent = Agent(df, schema_file, 'postgres', username='username', password='password', database='DBname', host='localhost', port=5432, aws_db=False)
response=agent.query_db("how many people died of stomach cancer in 2011")
print(response)
#Response =[(22,)]
from tableqa.agent import Agent
agent = Agent(df, schema_file, 'mysql', username='username', password='password', database='DBname', host='localhost', port=5432, aws_db=False)
response=agent.query_db("how many people died of stomach cancer in 2011")
print(response)
#Response =[(22,)]
Refer to step 1 in the document to create a mysql db instance on Amazon RDS. Same steps can be followed for creating a PostgreSQL db instance by selecting PostgreSQL in the Engine tab. Obtain the username, password, database, endpoint, and port from your database connection details on Amazon RDS.
from tableqa.agent import Agent
agent = Agent(df, schema_file, 'postgres', username='Master username', password='Master password', database='DB name', host='Endpoint', port='Port', aws_db=True)
response=agent.query_db("how many people died of stomach cancer in 2011")
print(response)
#Response =[(22,)]
sql=agent.get_query("How many people died of stomach cancer in 2011")
print(sql)
#sql query: SELECT SUM(Death_Count) FROM cancer_death WHERE Cancer_site = "Stomach" AND Year = "2011"
csv_path="/content/tableQA/tableqa/cleaned_data"
schema_path="/content/tableQA/tableqa/schema"
agent=Agent(csv_path,schema_path)
csv_path="s3://{bucket}/cleaned_data"
schema_path="s3://{bucket}/schema"
agent = Agent(csv_path, schema_path, aws_s3=True, access_key_id=access_key_id, secret_access_key=secret_access_key)
Join our workspace:Slack