These projects cover basic data science and machine learning, natural language processing, and large language modeling. Most were built for research or competitions.

VIVYNet, Text-to-Music Generative AI Model

Researching how Transformer encoder-decoder and large language model architectures can be combined into a multimodal text-to-music generative AI model. Specifically, we are examining how the concept performs when a text-based AI encoder is merged with a music-based AI decoder. The project currently uses several research-oriented Python libraries, including Fairseq, FastTransformer, and PyTorch, and the model will be trained on a dataset of more than 100,000 samples.

Air Quality Prediction

* Investigated and predicted air quality in Uganda as part of a contest.
* Processed the tabular dataset by filling in missing values, scaling numeric features, encoding categorical values, and dropping features that caused overfitting, which reduced the Mean Absolute Error from 20 to 17.
* Trained and tuned a neural network on the dataset, which further reduced the Mean Absolute Error from 17 to 16.69.
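
The preprocessing steps listed above can be sketched in plain Python. This is only an illustrative sketch, not the contest code: the column names (`temperature`, `site_type`), the toy values, and the choice of mean imputation and standard scaling are all assumptions made for the example.

```python
from statistics import mean, pstdev

def impute_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]

def standardize(column):
    """Scale a numeric column to zero mean and unit variance."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]
    return [(v - mu) / sigma for v in column]

def one_hot(column):
    """Encode a categorical column as one 0/1 indicator list per category."""
    categories = sorted(set(column))
    return {c: [1 if v == c else 0 for v in column] for c in categories}

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical toy columns, invented for illustration only.
temperature = [21.0, None, 25.0, 23.0]
site_type = ["urban", "rural", "urban", "urban"]

filled = impute_mean(temperature)   # missing value replaced by the column mean
scaled = standardize(filled)        # zero mean, unit variance
encoded = one_hot(site_type)        # {"rural": [...], "urban": [...]}

# MAE on made-up targets and predictions: (2 + 2 + 3) / 3
mae = mean_absolute_error([10.0, 20.0, 30.0], [12.0, 18.0, 33.0])
```

In practice a library such as pandas or scikit-learn would likely handle these steps; the stdlib version is used here only to keep the sketch self-contained.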


Developed a Vietnamese chatbot in Python using TensorFlow, Firebase, and Neo4j to help students cope with stress and loneliness during COVID quarantine and to study with less procrastination. Designed a custom bigram algorithm and a Word2Vec embedding layer specifically for processing the Vietnamese text dataset, which improved language comprehension by more than 25%. Implemented a multi-task model that performs intent classification, POS tagging, and knowledge reasoning with a BiLSTM and a Neo4j graph database, achieving 80% accuracy.
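
The custom bigram algorithm itself is not described here. As a rough sketch of why bigrams matter for Vietnamese, where many words are written as two space-separated syllables, the example below counts adjacent syllable pairs and merges the frequent ones into single tokens. The function names, the `min_count` threshold, and the three-sentence corpus are hypothetical and chosen only for illustration.

```python
from collections import Counter

def count_bigrams(sentences):
    """Count adjacent syllable pairs across whitespace-split sentences."""
    counts = Counter()
    for sentence in sentences:
        syllables = sentence.lower().split()
        counts.update(zip(syllables, syllables[1:]))
    return counts

def merge_frequent_bigrams(sentence, counts, min_count=2):
    """Greedily join syllable pairs seen at least min_count times."""
    syllables = sentence.lower().split()
    tokens, i = [], 0
    while i < len(syllables):
        pair = tuple(syllables[i:i + 2])
        if len(pair) == 2 and counts[pair] >= min_count:
            tokens.append("_".join(pair))   # treat the pair as one word
            i += 2
        else:
            tokens.append(syllables[i])
            i += 1
    return tokens

# Tiny made-up corpus; a real dataset would be far larger.
corpus = [
    "sinh viên học bài",
    "sinh viên căng thẳng",
    "tôi căng thẳng quá",
]
counts = count_bigrams(corpus)
tokens = merge_frequent_bigrams("sinh viên căng thẳng", counts)
print(tokens)
```

The merged tokens could then be fed to a Word2Vec-style embedding layer so that compound words receive a single vector rather than one per syllable.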


