Data Science capstone projects batch #30

by Ekaterina Butyugina

We’re excited to highlight the impressive accomplishments of our February cohort graduates, who completed both the program and their final projects.

Over the last three months, the dedicated students of Zurich’s Batch #30 and Munich’s 9th cohort tackled a wide variety of complex, hands-on challenges. Their passion, technical skill, and perseverance shone through in every project. Take a look at how our graduates are using data science to generate insights, push boundaries, and create real-world impact.


AI Offer Automation

Students: Sarah Cossey, Guillermo Ibanez, Michael Marty

Arthur Weber AG is a leading Swiss wholesaler specializing in construction materials, building technology, and industrial supplies. With a strong regional presence and decades of industry experience, the company serves professionals across the construction, plumbing, electrical, and HVAC sectors. Arthur Weber is recognized for its customer-centric approach, technical expertise, and commitment to delivering reliable, high-quality solutions tailored to each project's specific needs.

Responding to public tenders is a critical but time-consuming process for the company. Each tender can run to hundreds of pages and include thousands of requested items, many of which must be carefully matched to the company’s extensive internal product database.

To reduce manual workload and accelerate response time, Arthur Weber partnered with three Constructor Academy data science students to prototype an AI-powered solution.


Solution Overview

The project team, comprising Sarah Cossey, Guillermo Ibanez, and Michael Marty, developed a proof-of-concept pipeline that applies modern natural language processing and semantic search techniques to streamline tender analysis. The core components include:
  • Automated Text Extraction: Tender PDFs are parsed and cleaned to extract relevant product-related information.
  • Item Structuring via GPT-4o-mini: A lightweight language model structures each item description, identifying the product, quantity, and unit, even when information is fragmented or embedded in a technical context.
  • Semantic Product Matching: Using vector embeddings and cosine similarity, each tender item is compared to Arthur Weber’s product database to retrieve top candidate matches.
  • Final Validation with GPT: GPT is then used again to assess and validate the match, simulating the human verification process (see the picture below).

Figure: Validation with GPT
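
The semantic matching step can be illustrated with a minimal sketch. This is not the team's implementation: the `embed` function below is a toy bag-of-words stand-in for a real embedding model, and the catalog entries, product IDs, and function names are hypothetical, invented purely for illustration. Only the ranking idea is taken from the description above: score each catalog entry by cosine similarity and keep the top candidates.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(item: str, catalog: dict[str, str], k: int = 3) -> list[tuple[str, float]]:
    """Return the k catalog IDs most similar to the tender item, best first."""
    query = embed(item)
    scored = [(pid, cosine_similarity(query, embed(desc))) for pid, desc in catalog.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical catalog entries, invented for illustration only.
CATALOG = {
    "P-001": "copper pipe 15 mm",
    "P-002": "PVC drain pipe 110 mm",
    "P-003": "LED ceiling light 20 W",
}

candidates = top_matches("copper pipe, 15 mm diameter", CATALOG, k=2)
for product_id, score in candidates:
    print(f"{product_id}: {score:.2f}")
```

In a real pipeline, the candidates returned here would be the short list handed to the final validation step, so the language model only has to judge a handful of plausible matches rather than the whole catalog.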


Key Challenges

The project was designed with real operational constraints in mind:
  • Long, technical documents with inconsistent formatting
  • High item volume and potential for ambiguous product descriptions
  • Massive catalog of internal products requiring intelligent filtering
  • Frequent mismatches between tender items and catalog availability

Moreover, Arthur Weber’s product database is vast. Ensuring that each extracted item description is complete with all necessary specifications to match the correct database entry is a demanding task.

The AI solution developed by the team helps address these issues by increasing speed, improving accuracy, and establishing a scalable foundation for future automation.
 

Time Align: An Intelligent Assistant in Microsoft Teams

Students: Finn Jost, Tony Kelly, Victoria Kildyushevskaya  

KI Performance is part of the KI group and focuses on AI and tech solutions as well as innovation consulting. The company wanted to improve employees' experience of filling out timesheets and tasks in the Blue Ant project management software by moving the process into Microsoft Teams.

To solve this, Finn, Tony, and Victoria developed a Microsoft Teams app that fetches project and task data from the Blue Ant API, confirms it with the employee in a human-like chat interaction, prompts for a detailed task description, and writes the updated information back to Blue Ant. This removes the friction of opening Blue Ant and filling out various fields after each meeting or task.
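
The conversation flow described above (fetch, confirm, describe, write back) can be sketched as a small state machine. This is a hedged illustration only: `FakeTimesheetClient` is a hypothetical in-memory stand-in and does not reflect the real Blue Ant API or the Teams bot framework, and the class names, method names, and messages are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class FakeTimesheetClient:
    """Hypothetical in-memory stand-in for the project tool's API."""
    tasks: list[str]
    entries: list[dict] = field(default_factory=list)

    def fetch_tasks(self) -> list[str]:
        return list(self.tasks)

    def write_entry(self, task: str, description: str) -> None:
        self.entries.append({"task": task, "description": description})


class TimesheetDialog:
    """Three-step chat flow: propose a task, confirm it, collect a description."""

    def __init__(self, client: FakeTimesheetClient):
        self.client = client
        self.state = "start"
        self.task = None

    def reply(self, message: str = "") -> str:
        if self.state == "start":
            tasks = self.client.fetch_tasks()
            if not tasks:
                return "No open tasks found."
            self.task = tasks[0]
            self.state = "confirm"
            return f"Did you work on '{self.task}'? (yes/no)"
        if self.state == "confirm":
            if message.strip().lower() != "yes":
                self.state = "start"
                return "Okay, nothing was logged."
            self.state = "describe"
            return "Please describe what you did."
        # state == "describe": store the entry and reset for the next task
        self.client.write_entry(self.task, message.strip())
        self.state = "start"
        return f"Logged '{self.task}'."


# Example conversation with a made-up task name.
client = FakeTimesheetClient(tasks=["Sprint review"])
dialog = TimesheetDialog(client)
first = dialog.reply()
second = dialog.reply("yes")
third = dialog.reply("Presented the demo")
```

Keeping the dialog logic separate from the client makes it possible to test the conversation without network access and to swap in a real API client later.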

Figure: Key Vault


Access was secured using Key Vault, as shown in the architecture figure. Future development could include further integration with tools such as calendars and meeting transcripts to streamline the process even more.

The team is grateful for the support from KI Performance, which made it possible to complete a working prototype in a very short time.


Conclusion

As we wrap up the Data Science Final Projects with Batch #30, we want to express heartfelt thanks to the incredible partner companies who made this journey so much richer. Your real-world challenges provided our students with the opportunity to push boundaries, apply their skills, and deliver creative, practical solutions. We're deeply grateful for your trust and collaboration.

To our outstanding students who joined us back in February, what a ride it’s been! Your hard work, curiosity, and growth over these past months have been nothing short of inspiring. We’re proud of what you’ve accomplished and excited to see where your talents take you next. Keep exploring and keep building. The data world needs more minds like yours.

If you’ve been following along and are feeling inspired, we’d love to welcome you into our next data science cohort. Head over to Constructor Academy to find out how you can start your journey.

Interested in reading more about Constructor Academy and tech-related topics? Then check out our other blog posts.
