Jul-10-2024, 04:49 PM
I am developing a chatbot that fetches the required data from MongoDB based on the question the user asks. For example, if the user asks for the total number of employees, it should return the number of employees stored in the employee collection.
What I have done:
1. Load the employee collection from the database into a CSV file.
2. Use the transformers pipeline to get the answer, as below.
import csv

import pandas as pd
from pymongo import MongoClient
from transformers import pipeline

# mongo_uri and database_name are assumed to be defined elsewhere (not shown in this post)

def mongo_to_csv(collection_name, input_text):
    # Connect to MongoDB
    client = MongoClient(mongo_uri)
    db = client[database_name]
    collection = db[collection_name]
    data = collection.find()
    # Extract all field names from the documents to handle varying fields
    all_keys = set()
    for document in data:
        all_keys.update(document.keys())
    # Re-query the collection since the previous iteration exhausted the cursor
    data = collection.find()
    # Open the CSV file in write mode ('w'); append mode ('a') would add a
    # second header and duplicate rows every time this function is called
    with open(f"{collection_name}.csv", mode='w', newline='', encoding='utf-8') as csv_file:
        # Create a CSV DictWriter with a stable (sorted) column order
        csv_writer = csv.DictWriter(csv_file, fieldnames=sorted(all_keys))
        # Write the header (field names)
        csv_writer.writeheader()
        # Write the data rows
        for document in data:
            csv_writer.writerow(document)
    print(
        f"Data from {collection_name} collection in {database_name} database has been written to {collection_name}.csv"
    )
    return responses_from_db(collection_name, input_text)

def responses_from_db(collection_name, input_text):
    tqa = pipeline(task="table-question-answering", model="google/tapas-base-finetuned-wtq")
    # TAPAS expects every cell to be a string
    table = pd.read_csv(f"{collection_name}.csv").astype(str)
    print(table)
    # Run the model once and reuse the result instead of querying twice
    output = tqa(table=table, query=input_text)['answer']
    print(output)
    return output
mongo_to_csv("employee_collection", "how many employees present ")
For the user input "how many employees present", this is my output:
Output: "The interns details requested are - 25875, 30503, 49530"
I want the count, but I am getting the employee IDs of all the employees present (there are only 3 entries in my employee collection).
Also, sometimes I get an incomplete answer.
For example, if I ask "what is employeename who is working in python", I get only the first employee working in Python and not all the employees working in Python.
My output is as below, but there are other employees also working in Python:
Output: "The details are - john"
Can my code be modified to better suit my requirement? And is there any other approach or model that better suits my requirement?
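One direction I am considering for the count problem, as a rough sketch only: as far as I understand, the table-question-answering pipeline for WTQ-style TAPAS models only selects cells and predicts an aggregation operator, and does not compute the aggregate itself. The sketch below assumes the result dict carries "cells" and "aggregator" keys next to "answer"; I have not confirmed this for every transformers version. The function name resolve_tapas_result and the sample dict are hypothetical, written by hand for illustration, not real model output.

```python
def resolve_tapas_result(result):
    """Turn a table-question-answering result dict into a final answer string.

    Assumption: `result` looks like
    {"answer": ..., "cells": [...], "aggregator": "COUNT"}.
    The "aggregator" key may be absent for models without aggregation.
    """
    cells = [str(cell).strip() for cell in result.get("cells", [])]
    aggregator = result.get("aggregator", "NONE")

    if aggregator == "COUNT":
        # The model selected the cells to count; the count is their number.
        return str(len(cells))

    if aggregator in ("SUM", "AVERAGE"):
        try:
            numbers = [float(cell) for cell in cells]
        except ValueError:
            # Non-numeric cells: fall back to listing them.
            return ", ".join(cells)
        if not numbers:
            return ""
        total = sum(numbers)
        value = total if aggregator == "SUM" else total / len(numbers)
        return f"{value:g}"

    # No aggregation: return every selected cell, not just the first one.
    return ", ".join(cells)


# Hand-written stand-in for a pipeline result (hypothetical, not real model output).
sample = {
    "answer": "COUNT > 25875, 30503, 49530",
    "cells": ["25875", "30503", "49530"],
    "aggregator": "COUNT",
}
print(resolve_tapas_result(sample))
```

In responses_from_db this would mean keeping the whole result dict from tqa(...) instead of only ['answer']. It only post-processes the cells the model selected, so it would not help when the model itself picks just the first matching employee.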
