Students gain an overview of material informatics, its main components, methods, and tools. Lectures and hands-on sessions give a basic introduction to computational algorithms, database design, machine learning, and statistical analysis methods, as well as their mathematical foundations. The main goal of the course is to analyze data-driven modeling strategies for several application problems and to identify the most appropriate solution in each case. The course materials cover the analysis of optimal data structures, the required steps in the data preparation process, the selection and application of statistical and machine learning methods, the analysis of the implementation environment, and efficient strategies for reporting and visualizing results. All exercises are performed in R, and the main differences from Python are discussed.

Subject aims:
- Introduction to material informatics and how it differs from related disciplines
- Introduction to R, main data types and structures, main differences from Python
- Generic and user-defined functions
- Writing efficient code in R: comparison of possible solutions
- Efficient data management strategies: SQL vs. NoSQL databases, existing materials databases
- Data preparation and outlier detection
- Statistical and machine learning: common points and main differences
- Supervised and unsupervised statistical/machine learning
- Data visualization and reporting
- Computer-based materials design
Semester: Winter semester (WiSe) 2024/25