Job Details

Internship - Data Science Student T&I Bergen op Zoom

Reference Number
Bergen op Zoom
Job Type
Vacancy Category
Research & Development


SABIC is a global leader in diversified chemicals with operations in more than 50 countries and over 35,000 employees worldwide. The Analytical Technology team (>30 FTEs) is part of the SABIC Corporate Technology and Innovation (T&I) organization and provides analytical characterization expertise to support technology programs across Strategic Business Units such as PETCHEM and SPECIALTIES. The chemical characterization laboratories are well equipped with an extensive array of modern analytical techniques in the area of advanced chromatography, spectroscopy, elemental analysis, morphology and microscopy.

At SABIC we attract diverse talent and develop their leadership and technical skills. We are currently looking for a motivated and independent student for a 6-month data science internship at SABIC (The Netherlands). This is a remarkable opportunity to work with data sets from state-of-the-art chromatography and mass spectrometry equipment.

Job Purpose & Responsibilities

Project: Alignment of Comprehensive Gas Chromatography Signals
The goal is to develop transformation functions that align the retention times of chromatograms acquired by comprehensive two-dimensional gas chromatography with a flame ionization detector (FID) and a mass spectrometer (MS), followed by global alignment of all FID data.
Comprehensive GC is a separation technique that combines two separation columns, allowing detailed analysis of complex samples. Repeatability is very important, however, and we currently experience a run-to-run shift in the data on the time axis (retention time) that makes it difficult to compare profiles (chromatograms) obtained from multiple samples exposed to different experimental conditions. For comparison purposes, a retention-time transformation function is therefore required to align the data points and/or features such as peaks in multiple chromatograms. An effective alignment tool of this kind could reduce or eliminate the need for manual intervention, thus increasing the consistency of the results over time.
A few profile alignment methods are available for warping comprehensive GC data, based on generalized rank annihilation, correlation-optimized shifting, a piecewise retention-time alignment algorithm, and the 2D correlation optimized warping (2-D COW) method. However, all of these profile alignment methods align the data based on retention times alone. The data file contains an extra dimension of data: the mass spectrum (MS) of each peak. A mass spectrum represents the masses in a chemical structure as bars, in patterns that result from the behaviour of the chemical in the mass spectrometer. Aligning compound peaks based solely on retention times may introduce a high rate of false-positive alignments between compounds with similar chemical functional groups. Thus, an alignment approach that combines retention times and mass spectra needs to be considered.
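As an illustration of why combining retention times with mass spectra helps, the following minimal Python sketch matches peaks between a reference run and a shifted run. It is not a method prescribed by the project: the peak layout (dictionaries with `rt1`, `rt2` and `spectrum`), the function names, the greedy matching strategy, the window sizes, the similarity threshold and all numbers are assumptions made for the example.

```python
import math


def cosine_similarity(spec_a, spec_b):
    """Cosine similarity between two spectra given as {m/z: intensity} dicts."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_peaks(ref_peaks, target_peaks, max_rt_shift=(10.0, 0.2), min_similarity=0.8):
    """Greedy matching (an assumed, simple strategy): for each reference peak,
    pick the unused target peak inside the retention-time window whose mass
    spectrum is most similar, provided the similarity reaches the threshold."""
    matches = []
    used = set()
    for i, ref in enumerate(ref_peaks):
        best_j, best_sim = None, min_similarity
        for j, tgt in enumerate(target_peaks):
            if j in used:
                continue
            if abs(ref["rt1"] - tgt["rt1"]) > max_rt_shift[0]:
                continue
            if abs(ref["rt2"] - tgt["rt2"]) > max_rt_shift[1]:
                continue
            sim = cosine_similarity(ref["spectrum"], tgt["spectrum"])
            if sim >= best_sim:
                best_j, best_sim = j, sim
        if best_j is not None:
            used.add(best_j)
            matches.append((i, best_j, best_sim))
    return matches


# Made-up peaks: two compounds that elute close together (times in seconds).
ref_peaks = [
    {"rt1": 600.0, "rt2": 1.50, "spectrum": {43: 100, 57: 80, 71: 40}},
    {"rt1": 604.0, "rt2": 1.55, "spectrum": {77: 60, 91: 100, 106: 45}},
]
# The same two compounds in a later run, shifted by about 4 s in the first dimension.
target_peaks = [
    {"rt1": 608.0, "rt2": 1.57, "spectrum": {77: 55, 91: 100, 106: 50}},
    {"rt1": 604.0, "rt2": 1.52, "spectrum": {43: 100, 57: 75, 71: 45}},
]

matches = match_peaks(ref_peaks, target_peaks)
for ref_index, target_index, similarity in matches:
    print(ref_index, target_index, round(similarity, 3))
```

In this constructed case the second reference peak (604.0 s) coincides in first-dimension retention time with the wrong target peak, so nearest-retention-time matching would pair different compounds; the spectral similarity check pairs each reference peak with the target peak that has the matching spectrum.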
Dual detection of a reference sample using retention time and MS enables chemical-structure elucidation and compound identification by MS, so that the compounds can be correlated.
A reference sample, well characterized for compound identification by MS, will be used to provide reference points for targeted compounds for each set of samples analysed in a sequence.
Based on the reference data points, data fusion for successive runs can be accomplished by aligning the chromatographic data and/or chromatographic features such as peaks and retention-time windows. Chromatographic alignment requires matching compounds by relating the retention times in one chromatogram to the retention times in the other chromatogram.
This project will focus on general issues and experimental performance of global two-dimensional mapping functions for aligning pairs or series of chromatograms. Inherent variability in retention times needs to be corrected with polynomial functions of different degrees, chosen for parametric and computational simplicity as well as robustness.
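A polynomial retention-time mapping of this kind can be sketched in a few lines of Python. The sketch below makes several assumptions that are not part of the project description: each retention-time dimension is corrected independently (no cross terms), the polynomial is fitted by ordinary least squares on matched peak pairs, and the function names and retention times (in minutes) are invented for the example.

```python
def fit_polynomial(x, y, degree):
    """Least-squares coefficients c[0] + c[1]*x + ... + c[degree]*x**degree,
    obtained from the normal equations by Gaussian elimination."""
    n = degree + 1
    # Normal equations A c = b with A[r][k] = sum(x^(r+k)) and b[r] = sum(y * x^r).
    A = [[sum(xi ** (r + k) for xi in x) for k in range(n)] for r in range(n)]
    b = [sum(yi * xi ** r for xi, yi in zip(x, y)) for r in range(n)]
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for k in range(col, n):
                A[r][k] -= factor * A[col][k]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(A[r][k] * coeffs[k] for k in range(r + 1, n))
        coeffs[r] = (b[r] - tail) / A[r][r]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the fitted polynomial at retention time x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


# Made-up first-dimension retention times (minutes) of matched peaks.
target_rt = [10.0, 20.0, 30.0, 40.0, 50.0]     # run to be aligned
reference_rt = [10.4, 20.3, 30.2, 40.1, 50.0]  # same compounds in the reference run

coeffs = fit_polynomial(target_rt, reference_rt, degree=1)
aligned_rt = [evaluate(coeffs, t) for t in target_rt]
print(coeffs)
```

Solving the normal equations directly becomes numerically fragile for higher degrees or for long retention times expressed in seconds; rescaling the time axis or using an established routine such as `numpy.polyfit` would be the more robust choice in practice.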
The performance of the global alignment model(s) will be evaluated to determine how effectively they represent the retention times of matched peaks in retention time and MS chromatograms, as well as the retention times of matched peaks in consecutive replicate runs and sequences over a period of time. Performance benchmarks need to be established that consider several aspects of chromatographic misalignment and residual differences.
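One possible way to quantify residual differences is to compare the retention-time error of matched peaks before and after alignment. The Python sketch below uses the root-mean-square error and the largest absolute residual; the choice of these two metrics, the function names and all numbers are assumptions for illustration, not benchmarks defined by the project.

```python
import math


def rmse(reference, observed):
    """Root-mean-square difference between matched retention times."""
    squared = [(r - o) ** 2 for r, o in zip(reference, observed)]
    return math.sqrt(sum(squared) / len(squared))


def max_abs_residual(reference, observed):
    """Largest absolute retention-time difference among matched peaks."""
    return max(abs(r - o) for r, o in zip(reference, observed))


# Made-up retention times (minutes) of five matched peaks.
reference_rt = [10.4, 20.3, 30.2, 40.1, 50.0]
before_alignment = [10.0, 20.0, 30.0, 40.0, 50.0]
after_alignment = [10.41, 20.29, 30.20, 40.12, 49.99]

rmse_before = rmse(reference_rt, before_alignment)
rmse_after = rmse(reference_rt, after_alignment)
print(round(rmse_before, 4), round(rmse_after, 4))
print(max_abs_residual(reference_rt, after_alignment))
```

Tracking such figures per run and per sequence would show whether a given polynomial degree keeps the residual misalignment small over time.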
An optimal peak alignment for comprehensive two-dimensional gas chromatography mass spectrometry using mixture similarity measure. Bioinformatics. 2011 Jun 15; 27(12): 1660–1666.
Alignment for Comprehensive Two-Dimensional Gas Chromatography with Dual Secondary Columns and Detectors. Anal. Chem. 87, 19, 10056-10063


  • Data Science background with the ability to manipulate large datasets
  • Good command of MATLAB
  • Interest in and ability to learn chromatography and mass spectrometry
  • Flexible, creative and independent
  • Good interpersonal skills
  • Ability to organize, prioritize and handle multiple projects simultaneously
  • Good verbal and written English communication skills

Further Information

If interested in this internship position, please apply via the button 'apply for this job' below.
Please note that you need to be a student to apply for this internship position!  

People Who Can

We’ve always believed that it takes special people to make a difference. They need ingenuity, brilliance and the freedom to turn their ideas into reality. At SABIC, this is our culture. We find open-minded, gifted and questioning people. Then we create the conditions for their talent to flourish: fulfilling work environments, state-of-the-art technology and the empowerment they need. So when you work for SABIC, you’re the person who can find new solutions and possibilities that change the world for the better.
SABIC, headquartered in Riyadh (Saudi Arabia), is one of the world’s leading manufacturers of Petrochemicals, Specialties, Agri-Nutrients, and Metals. We supply these materials to other companies, who use them to make products on which the world has come to depend. At SABIC, we believe in using chemistry to power the ambitions of our customers, business partners, employees and societies we live in. We call this “CHEMISTRY THAT MATTERS™”.
In Europe, SABIC employs over 5,000 professionals and runs 6 world-scale production sites and 4 technology & innovation centers. We are present in almost every European country. Major locations within our broad network of innovation, production and marketing are: Teesside (UK), Cartagena (Spain), Gelsenkirchen (Germany), Geleen (The Netherlands), Bergen op Zoom (The Netherlands) and Genk (Belgium). Based on this strong foundation, we are a key producer of olefins, polyolefins, engineering thermoplastic resins, films and sheets. Discover more at