Hello, I'm

Vivek Chaurasia

Data Scientist

Get To Know More

About Vivek Chaurasia

Hi, I'm Vivek Chaurasia, a graduate student in Artificial Intelligence at RIT with hands-on experience in machine learning, computer vision, NLP, and statistical modeling. I've built and deployed real-world AI systems, ranging from RAG-based news assistants to real-time fraud detection, using tools like TensorFlow, PyTorch, AWS, and Docker. I specialize in fine-tuning large language models, applying statistical methods, and implementing research papers from scratch to solve complex problems. I'm passionate about building smart, scalable solutions that make a real impact.

Here Are My

Proficiencies

I specialize in transforming unstructured text into actionable insights using Natural Language Processing. From fine-tuning transformer-based models to deploying end-to-end NLP pipelines, I use tools like Python, PyTorch, TensorFlow, Hugging Face, LangChain, and OpenAI APIs to build smart, scalable language systems. Whether the task is text classification, generation, summarization, or semantic search, I bring a deep understanding of both language and learning to real-world problems.

Data Science

Support data-driven decision making with rigorous statistical analysis, feature engineering, and machine learning that extract actionable insights and power smarter business strategies.

Natural Language Processing (NLP)

Unlock insights from unstructured text using NLP techniques such as sentiment analysis, summarization, question answering, and LLM fine-tuning to build systems that understand language.

Cloud, MLOps & AWS

Deploy scalable machine learning solutions using cloud platforms like AWS and GCP. Automate workflows with CI/CD pipelines, model monitoring, and real-time data integration for robust production-grade systems.

Docker & DevOps

Containerize and deploy models seamlessly across environments using Docker. Ensure reproducibility, scalability, and faster iteration through efficient DevOps practices.

Data Analytics

Drive smarter decisions with advanced data analytics, combining statistical modeling, predictive algorithms, and machine learning to uncover patterns, optimize performance, and deliver measurable impact.

Collaboration

I'm a strong believer in cross-functional collaboration: working closely with data scientists, engineers, and domain experts to translate ideas into real-world solutions that deliver value.

Browse Through My

Projects

LLM - Detect AI Generated Text

As AI-generated content becomes harder to distinguish from human writing, I developed an autoencoder-based model that detects AI-generated text with 88% accuracy, a 66% improvement over GAN-based approaches. The system was built using NLP, deep learning, and generative models, with Docker for containerization and MLflow for experiment tracking.

Personalized News Assistant (RAG-Based)

Built an AI-powered news assistant on a Retrieval-Augmented Generation (RAG) system that fetches and summarizes real-time news articles and answers user queries about them. Used LangChain, OpenAI, ChromaDB, and BeautifulSoup for accurate, dynamic topic retrieval. Achieved a ROUGE-1 score of 85% on summarization and deployed optimized APIs for real-time news exploration.
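
For readers unfamiliar with the metric, ROUGE-1 measures unigram overlap between a generated summary and a reference summary. The following is a minimal sketch using only the Python standard library. It is not the project's evaluation code; the function name `rouge1` and the whitespace tokenization are illustrative assumptions, and published ROUGE implementations typically add stemming and more careful tokenization.

```python
from collections import Counter


def rouge1(reference: str, candidate: str) -> dict:
    """Unigram-overlap precision, recall, and F1 (illustrative sketch)."""
    # Assumption: lowercase whitespace tokenization, no stemming.
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((ref_counts & cand_counts).values())

    recall = overlap / max(sum(ref_counts.values()), 1)
    precision = overlap / max(sum(cand_counts.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge1("the cat sat on the mat", "the cat lay on the mat")
print(scores)
```

In this toy example five of the six tokens match on each side, so precision, recall, and F1 all equal 5/6.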

Passive-Aggressive Email Rewriter

Developed an NLP-powered email rewriter that detects and transforms email tone with 93% accuracy. Fine-tuned BERT for tone classification (outperforming XGBoost by 32%) and adapted LLaMA with LoRA for efficient rewriting of passive-aggressive text. Deployed on AWS Lambda with S3 storage for real-time scalability, with cost-efficient performance monitoring via CloudWatch.

Image Captioning System Using Deep Learning

Built an AI-driven image captioning model that generates descriptive captions using Xception for feature extraction and an LSTM-based sequence model for text generation. Achieved a BLEU score of 0.35, a 40% improvement over the baseline. Deployed on AWS EC2 with Docker and integrated CI/CD pipelines via Jenkins for automation.
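
As background on the BLEU metric mentioned above, here is a simplified sketch of its unigram case: clipped unigram precision multiplied by a brevity penalty. It is an illustration under stated assumptions, not the project's evaluation code. The name `bleu1` and the whitespace tokenization are hypothetical, and standard BLEU combines n-gram precisions up to 4-grams with a geometric mean, usually over a whole corpus.

```python
import math
from collections import Counter


def bleu1(reference: str, candidate: str) -> float:
    """Unigram BLEU for one reference/candidate pair (simplified sketch)."""
    # Assumption: lowercase whitespace tokenization.
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not cand:
        return 0.0

    # Clipped precision: a candidate token counts at most as often
    # as it appears in the reference.
    clipped = sum((Counter(cand) & Counter(ref)).values())
    precision = clipped / len(cand)

    # Brevity penalty discourages captions shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * precision


print(bleu1("a dog runs on the beach", "a dog runs"))
```

The short caption matches every token it contains, but the brevity penalty scales the score down because it is half the reference length.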

Customer Churn Prediction for Subscription Services

Developed a machine learning model to predict customer churn with 94% accuracy, leveraging SQL, Python, and PySpark for data processing on 240K records. Automated the ETL pipeline (SQL → Python → AWS S3) with AWS Lambda, improving efficiency by 40%. Built an interactive Power BI dashboard to visualize churn trends, enabling data-driven decision-making.

Explore My

Other Platforms

Contact Me

Contact Information

Vivek Chaurasia

Austin, Texas