Job Details

Safety Data Science Internship

Reference Number
Job Type
Vacancy Category
Research & Development


The GTC Europe team (~50 FTEs) is part of the SABIC Corporate Technology and Innovation (T&I) organization and provides EHS, maintenance and analytical characterization expertise to support technology programs across Strategic Business Units. We are currently looking for a motivated and independent student for a five-month data science internship at SABIC (Geleen, The Netherlands). Safety data science offers a unique opportunity to work on a highly relevant topic that affects the quality of people’s lives.

Job Purpose & Responsibilities

Safety Data Science

Safety of our employees is a top priority for SABIC. This project is envisioned to leverage our historical database of European safety observation data and incident reports in order to better visualize historical trends and ultimately to use the data in a predictive manner to reduce the frequency of safety incidents.

Non-structured data, including written communication, makes up ~80% of all data available to businesses today. To make sense of this data, scientists have developed natural language processing tools and text mining procedures that leverage these rich sources.
At SABIC, and in fact across industry, safety data is predominantly non-structured. The data is available in three main types: a) incident reports: detailed information about safety incidents, b) concern reports: voluntarily logged information about unsafe situations, c) observation reports: based on a system of peer observation for safety improvement advice.

The working hypothesis is that data sets b and c above may be used in a predictive capacity for data set a. To this end, text mining will be performed on the concern and observation data. The data set will first be cleaned and prepared; this may include, for example, tokenisation, stemming or lemmatisation, the removal of noise terms, the combination of words and the development of a document-term matrix. This work, together with some of the more obvious correlations of the data with, for example, time of incident occurrence, location and prevailing weather conditions, is expected to form the major part of the internship project.
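
As an illustration only, the preparation steps named above could be sketched in Python as follows. The noise-term list, the suffix-stripping rule and the two example reports are invented for this sketch and are not SABIC data or an agreed method; a real project would more likely use an established NLP library for stemming or lemmatisation.

```python
import re
from collections import Counter

# Hypothetical noise-term list; a real project would use a curated one.
NOISE_TERMS = {"the", "a", "an", "of", "on", "in", "was", "is", "and", "to", "near"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def crude_stem(token):
    """Very rough suffix stripping; a stand-in for a real stemmer or lemmatiser."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenise, drop noise terms, then stem what remains."""
    return [crude_stem(t) for t in tokenize(text) if t not in NOISE_TERMS]


def document_term_matrix(documents):
    """Return (vocabulary, matrix) with one row of term counts per document."""
    counts = [Counter(preprocess(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, matrix


# Invented example reports, for illustration only.
reports = [
    "Oil leaking on the floor near pump",
    "Worker slipped on oil leaks",
]
vocabulary, matrix = document_term_matrix(reports)
for term, column in zip(vocabulary, zip(*matrix)):
    print(term, column)
```

Each row of the matrix corresponds to one report and each column to one stemmed term, so "leaking" and "leaks" are counted under the same term.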

Depending on results, the internship may lead to a subsequent MSc thesis project. This possibility will be assessed and defined during the internship.

With more advanced work in mind, it is important to remember the end goal of using EHS data in a predictive capacity. Because EHS culture is assumed to be a leading indicator of EHS performance, the more advanced stages of the project will pay specific attention to text mining descriptors that are culturally relevant. With support from EHS colleagues, the data will be scanned and coded for entries that correspond to a poor EHS culture. Once a matrix of culturally relevant text entries versus time-resolved frequency exists, several correlation and prediction procedures are possible.

The coded frequency data could initially be correlated using nonparametric regression analysis, in which culturally relevant terms from a previous, fixed time period are correlated with incidents in the current time period. Important metadata such as times, locations, employee experience, proximity of an event to public holidays, important company announcements and business performance may all be investigated to check whether they improve the correlation. Another option is to develop, train and validate an artificial neural network that automatically calculates an expected incident rate based on past EHS cultural performance. These are not the only possibilities, and there is room for independent thinking and innovation from the student.
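
One simple way to picture the lagged comparison described above is a Spearman rank correlation between term counts in one period and incident counts in the next. This is a sketch under stated assumptions: Spearman correlation is a nonparametric measure chosen here for brevity, not the regression analysis itself, and the function names and monthly counts below are invented for illustration.

```python
def ranks(values):
    """Return 1-based average ranks, giving tied values the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes equal lengths and non-constant inputs."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def lagged_spearman(term_counts, incident_counts, lag=1):
    """Correlate term counts in period t - lag with incidents in period t."""
    if lag < 1 or lag >= len(term_counts):
        raise ValueError("lag must be between 1 and len(term_counts) - 1")
    return spearman(term_counts[:-lag], incident_counts[lag:])


# Invented monthly counts, for illustration only.
culture_term_counts = [2, 5, 3, 8, 6, 9]
incident_counts = [0, 1, 3, 2, 5, 4]
rho = lagged_spearman(culture_term_counts, incident_counts, lag=1)
print(f"Spearman correlation at lag 1: {rho:.2f}")
```

Because only ranks are compared, the measure makes no assumption about the shape of the relationship beyond monotonicity, which suits small, skewed count data. Metadata such as location or season could be examined by computing the same statistic within subsets.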


  • Data science background with the ability to manipulate large datasets
  • Good command of R and Python
  • Interest and ability to learn safety concepts
  • Flexible, creative and independent
  • Good interpersonal skills
  • Ability to organize, prioritize and handle multiple project aspects simultaneously
  • Good verbal and written English communication skills
  • Dutch language skills would be helpful

Further Information

For more information about this role you can contact: Christian Wold, Chief Scientist, tel. 0031 (0)164291055

People Who Can

We’ve always believed that it takes special people to make a difference. They need ingenuity, brilliance and the freedom to turn their ideas into reality. At SABIC, this is our culture. We find open-minded, gifted and questioning people. Then we create the conditions for their talent to flourish: fulfilling work environments, state-of-the-art technology and the empowerment they need. So when you work for SABIC, you’re the person who can find new solutions and possibilities that change the world for the better.
SABIC, headquartered in Riyadh (Saudi Arabia), is one of the world’s leading manufacturers of Petrochemicals, Specialties, Agri-Nutrients, and Metals. We supply these materials to other companies, who use them to make products on which the world has come to depend. At SABIC, we believe in using chemistry to power the ambitions of our customers, business partners, employees and societies we live in. We call this “CHEMISTRY THAT MATTERS™”.
In Europe, SABIC employs over 5,000 professionals and runs 6 world-scale production sites and 4 technology & innovation centers. We are present in almost every European country. Major locations within our broad network of innovation, production and marketing are: Teesside (UK), Cartagena (Spain), Gelsenkirchen (Germany), Geleen (The Netherlands), Bergen op Zoom (The Netherlands) and Genk (Belgium). Building on this strong foundation, we are a key producer of olefins, polyolefins, engineering thermoplastic resins, films and sheets. Discover more at