Hi! I'm Laura Allison Obermaier

Master’s student in Software Engineering at Stevens Institute of Technology (’25) with graduate certificates in Machine Learning and Cybersecurity. Earned my B.S. in Computer Science from Florida State University (’23), graduating with a 3.97 GPA and honors distinctions.

This summer, I’m interning as a full-stack developer at NutriverseAI and serving as a course assistant for Advanced Algorithms. I also conducted undergraduate research under Dr. Jonathan Adams at FSU. Currently seeking full/part-time roles beginning 2026.

Application Overview

This project features ingredient-list tokenization, fuzzy matching, alias retrieval, semantic filters, keyword matching, web scraping, summary generation, evaluation metrics, and much more. The aim is to showcase the power of preprocessing and task specialization in reducing hallucination and improving output quality despite decreased model size and complexity.

Project Report

This project report provides a detailed description of the project, including an abstract, introduction, related work, methodology, experimental setup, results, and conclusion and future work.

Tokenization and Alias Normalization

The ingredient list was first tokenized into individual ingredients, which were normalized via fuzzy matching. Aliases were then retrieved using inflect for pluralization and singularization, and common-name and scientific-name aliases were generated with the Mistral-7B-Instruct-GPTQ model. These were then filtered and normalized for use during data retrieval.
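As an illustration of the tokenize-then-normalize step, here is a minimal Python sketch. It uses the standard library's difflib as a stand-in for the project's fuzzy matcher, a tiny hypothetical canonical vocabulary, and a naive plural rule in place of inflect; all names are illustrative, not the project's actual code.

```python
import difflib

# Hypothetical canonical ingredient vocabulary (stand-in for the project's list).
CANONICAL = ["ascorbic acid", "citric acid", "whey protein", "sucralose"]

def tokenize(ingredient_list: str) -> list[str]:
    """Split a raw label string into individual ingredient tokens."""
    return [tok.strip().lower() for tok in ingredient_list.split(",") if tok.strip()]

def normalize(token: str, cutoff: float = 0.8) -> str:
    """Fuzzy-match a token against the canonical vocabulary; difflib stands in
    for whatever fuzzy-matching library the project actually used."""
    matches = difflib.get_close_matches(token, CANONICAL, n=1, cutoff=cutoff)
    return matches[0] if matches else token

def plural_aliases(name: str) -> set[str]:
    """Naive singular/plural aliases; the project used the inflect package."""
    aliases = {name}
    aliases.add(name[:-1] if name.endswith("s") else name + "s")
    return aliases

# Misspelled tokens are pulled back to their canonical forms.
tokens = tokenize("Citric Acd, Whey Protien, water")
normalized = [normalize(t) for t in tokens]
```

Tokens without a sufficiently close canonical match (such as "water" above) pass through unchanged, so unknown ingredients are not silently rewritten.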

Data Retrieval and Web Scraping

Data was retrieved from trusted sources such as PubMed, PubMed Central, Europe PubMed Central, OpenFDA, and RxNorm, as well as via web scraping with Google Custom Search Engine (CSE). During scraping, NER entities were extracted from the retrieved data, and the search included only matches for health-related keywords and ingredient aliases. Non-human studies, irrelevant NER-labeled data, product-related information, and other noise were discarded using filters to retain only relevant data.
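The relevance filtering described above can be sketched as a simple keyword-and-alias predicate. The keyword sets, exclusion markers, and snippets below are hypothetical stand-ins, and the real pipeline also used NER labels, which are omitted here for brevity.

```python
# Hypothetical term sets; the real pipeline matched health-related keywords
# and ingredient aliases, and used NER labels to drop irrelevant hits.
HEALTH_KEYWORDS = {"risk", "effect", "dose", "safety", "toxicity"}
EXCLUDE_MARKERS = {"in mice", "in rats", "buy now", "product page"}

def is_relevant(snippet: str, aliases: set[str]) -> bool:
    """Keep a scraped snippet only if it mentions an ingredient alias and a
    health-related keyword, and trips no exclusion marker."""
    text = snippet.lower()
    if any(marker in text for marker in EXCLUDE_MARKERS):  # non-human / product pages
        return False
    has_alias = any(alias in text for alias in aliases)
    has_keyword = any(kw in text for kw in HEALTH_KEYWORDS)
    return has_alias and has_keyword

snippets = [
    "High doses of citric acid showed no adverse effect in humans.",
    "Citric acid toxicity was observed in mice.",
    "Buy now: citric acid supplements.",
]
kept = [s for s in snippets if is_relevant(s, {"citric acid"})]
```

Only the human-study sentence survives; the animal-study and product-page snippets are filtered out.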

Sentence-Level Preprocessing and Summary Generation

The data was parsed into sentences and filtered to retain only those featuring results- or conclusion-related terms, an alias, and health-related terms. Sentences were then normalized by removing duplicates and replacing all aliases with the originally matched ingredient name. Summaries of health effects and dietary restrictions were then generated for each ingredient using the Mistral-7B-Instruct-GPTQ model and AutoGPTQ, and post-processed to remove duplicates and irrelevant information. The top five summaries were selected and displayed.
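A minimal sketch of the sentence-level filtering, alias replacement, and deduplication, assuming hypothetical term lists and a toy alias map (the real system's filters and alias tables are more extensive):

```python
import re

# Hypothetical term lists standing in for the project's filters.
RESULT_TERMS = {"showed", "concluded", "resulted", "found"}
HEALTH_TERMS = {"effect", "risk", "benefit"}

def preprocess(text: str, aliases: dict[str, str]) -> list[str]:
    """Split into sentences, keep those with a result/conclusion term, an
    alias, and a health term, then map aliases back to the matched name."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    kept, seen = [], set()
    for s in sentences:
        low = s.lower()
        if not any(t in low for t in RESULT_TERMS):
            continue
        if not any(t in low for t in HEALTH_TERMS):
            continue
        hit = next((a for a in aliases if a in low), None)
        if hit is None:
            continue
        norm = low.replace(hit, aliases[hit]).strip()
        if norm not in seen:  # drop exact duplicates
            seen.add(norm)
            kept.append(norm)
    return kept

aliases = {"ascorbic acid": "vitamin c"}
text = ("The trial showed a protective effect of ascorbic acid. "
        "The trial showed a protective effect of ascorbic acid. "
        "Ascorbic acid is a white solid.")
result = preprocess(text, aliases)
```

The duplicated sentence is kept once, the purely descriptive sentence is dropped, and the alias is rewritten to the originally matched name.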

Automatic Evaluation

We evaluated summaries using ROUGE-1, ROUGE-L, and BLEUScore-F1 across three ingredient groups: rare (protein powder), common, and sugar (image 2; rows in that order). BLEUScore-F1 consistently outperformed ROUGE, suggesting our model prioritizes semantic clarity over lexical overlap. Right-column scores reflect improved performance after alias integration.
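For reference, ROUGE-1 F1 can be computed from unigram overlap alone. The sketch below is a minimal stand-alone implementation for illustration, not the evaluation library the project used.

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Minimal ROUGE-1 F1: unigram overlap between candidate and reference."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# 3 of 5 candidate unigrams match 3 of 4 reference unigrams -> F1 = 2/3.
score = rouge1_f1("sugar may raise blood pressure", "sugar raises blood pressure")
```

Because ROUGE rewards exact word overlap, a paraphrase that preserves meaning ("raise" vs. "raises") still loses credit, which is why a semantically oriented metric can score higher on the same output.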

Human Evaluation

Twenty-one participants (19 recruited via social media, 2 via NJ Transit) rated our summaries in a two-part survey. Part A used a 1–10 scale to evaluate informativeness, accuracy, and clarity. Part B compared our summaries to ChatGPT-4o’s. Results showed our model was informative and understandable, with no strong preference between the two models. Feedback led to improvements (see strikethroughs in image 2).

Key Takeaways and Repository

Both human and automatic evaluations show our model performs competitively with ChatGPT-4o. This supports the potential of smaller, specialized models for domain tasks. Future directions include RLHF, larger models, improved aliasing, sentiment analysis, and hybrid systems. Click here to view the repo.

Application Overview

This project combined ML models using stacking (LogReg, Decision Trees, Random Forests, Gradient Boosting). Bias-variance analysis was included. The best combo, LogReg + Random Forest, achieved 98.66% accuracy with a bias² + variance of 0.5. Final output: bankruptcy predictions in CSV format.
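A stacking ensemble along these lines can be sketched with scikit-learn's StackingClassifier. The synthetic dataset and the choice of Random Forest as base learner with Logistic Regression as meta-learner are assumptions for illustration; the project's actual data, hyperparameters, and base/meta assignment may differ.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the bankruptcy dataset.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# The reported best combination was LogReg + Random Forest; here Random Forest
# is the base learner and Logistic Regression blends its predictions.
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=100, random_state=0))],
    final_estimator=LogisticRegression(),
)
stack.fit(X_tr, y_tr)
accuracy = stack.score(X_te, y_te)
```

Stacking trades a little extra training cost (cross-validated base predictions) for the chance that the meta-learner corrects systematic errors of the base models, which is also where the bias-variance analysis comes in.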

Code Snippet 1/7

For a comprehensive demonstration of the development skills utilized in this project, please visit the project repository. These code snippets offer a glimpse into the various capabilities employed throughout the project.

Dataset Import and Scaling

In this step, the dataset is loaded using Pandas, and features are extracted. A StandardScaler is applied to normalize the data for better model performance.
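A minimal sketch of this step, using a tiny inline DataFrame in place of the real CSV (the column names are hypothetical; the actual file would be loaded with pd.read_csv):

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Tiny inline frame standing in for the real dataset.
df = pd.DataFrame({
    "current_ratio": [1.2, 0.8, 2.5, 0.4],
    "debt_ratio":    [0.5, 0.9, 0.3, 1.1],
    "bankrupt":      [0, 1, 0, 1],
})
X = df.drop(columns=["bankrupt"]).values  # feature matrix
y = df["bankrupt"].values                 # target labels

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)  # zero mean, unit variance per feature
```

Scaling matters here because Logistic Regression is sensitive to feature magnitudes, while the tree-based base learners are not.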

Application Overview

The HackerNews Flask application provides a robust and secure platform for delivering the latest news from the Hacker News website, updated hourly. Served via nginx and gunicorn, with SQLite for data management, it offers a streamlined user experience with features like pagination and, for logged-in users, interaction through likes and dislikes. Authentication is managed via Auth0, enabling users to sign in with Google. The application includes several functional routes, such as a profile view for account information and an admin interface for managing user roles. Continuous updates and user interactions are facilitated on a cleanly designed interface, ensuring that users remain engaged and informed.
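Pagination of the hourly story list can be sketched as a small helper like the one below; the function and field names are illustrative, not the app's actual code.

```python
def paginate(items: list, page: int, per_page: int = 10) -> dict:
    """Slice a story list for one page, as a Flask route handler might."""
    total_pages = max(1, -(-len(items) // per_page))  # ceiling division
    page = min(max(page, 1), total_pages)             # clamp into valid range
    start = (page - 1) * per_page
    return {
        "items": items[start:start + per_page],
        "page": page,
        "total_pages": total_pages,
        "has_next": page < total_pages,
    }

stories = [f"story {i}" for i in range(23)]
page2 = paginate(stories, page=2, per_page=10)  # stories 10..19
```

Clamping out-of-range page numbers rather than raising keeps hand-edited URLs from producing errors.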

Application Overview

This comprehensive web application offers robust law firm management tools, facilitating efficient handling of clients, projects, and staff. It supports full CRUD operations and features advanced utilities like a windowed timer, mass-billing capabilities, customizable interface options, and secure data deletion protocols.

Application Overview

This Android application provides comprehensive note management and customization features. Users can adjust text color, background color, font size, and text alignment. It supports saving, deleting, and undoing changes to notes, which can be displayed individually or in a card/recycle view format.

Application Overview

ProFessUp is a platform modeled after Rate My Professor, designed to facilitate CRUD operations for professors, reviews, courses, and user profiles. The application enables users to efficiently search for professors and filter reviews by course. It presents professor profiles and reviews in a card-based layout, utilizing sliders, drop-down menus, buttons, and checkboxes to enhance usability and comprehension. The interface employs a user-friendly color scheme to optimize the user experience. Because the MongoDB database was deleted, functionality is demonstrated exclusively through video in this portfolio project.

Program Overview

This program facilitates the manipulation of FAT32 images through a shell-like interface, supporting commands such as ls, cd, mkdir, creat, open, close, read, append, lseek, lsof, rm, and rm -r.

Program Overview

This program is designed to simulate a shell environment. It incorporates a comprehensive set of functionalities, including both internal and external command processing, as well as the capability to execute an external program. These features collectively emulate the typical operations of a conventional shell.

Program Overview

This program utilizes an advanced elevator scheduling algorithm that interfaces seamlessly with a custom kernel module. It incorporates system calls, efficient scheduling techniques, kernel operations, mutexes/locks, and multithreading to optimize performance and ensure robust system integration.