Facts about the course
- Study points:
- Responsible department: Faculty of Logistics
- Course Leader: Yury Redutskiy
- Lecture Semester:
- Teaching language:
- ½ year
IDA720 Applied Data Analytics (Spring 2020)
About the course
The growth of digital technologies makes it easier to collect large volumes of data that business analytics can use to improve decision-making and the performance of various business activities. Data analytics is the process of examining data sets, with the aid of specialized systems and software, in order to draw conclusions about the information they contain.
To be of benefit, the examination of data sets is conducted with a specified goal or, in other words, with a properly formulated practical question. Once the question is determined, the steps of data analysis include capturing the needed data from various sources, then cleaning, preparing, and aligning it. Only then can the formal analysis be conducted, and its results interpreted and finally communicated to the appropriate audience to determine the best course of action.
Formal modelling is often carried out with the purpose of making predictions. In many cases, however, the formal analysis of datasets is aimed at uncovering unobserved facts about the data and discovering patterns or structures. A significant challenge that arises when working with datasets in many real-life settings is big data: data of various kinds is collected in huge volumes, usually in real time, and grows so quickly that the resulting datasets cannot be handled by traditional data processing techniques and software tools. Examples include sensor data gathered in real time in production and manufacturing, city traffic data, data from meters in electric grids, social media, etc. Sampling, i.e. selecting the necessary observations from the larger data set, is often used to infer patterns, relationships, and dependencies from big data, or to make predictions and ultimately answer the relevant question of interest.
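As an illustration of sampling from data that arrives in real time, the following is a minimal sketch of reservoir sampling (Algorithm R), one common way to keep a fixed-size uniform sample from a stream whose length is not known in advance. It is not taken from the course material; the function name `reservoir_sample` and the simulated sensor stream are assumptions made for this example.

```python
import random


def reservoir_sample(stream, k, seed=0):
    """Return a uniform random sample of up to k items from an iterable.

    Uses Algorithm R: the first k items fill the reservoir, and item i
    (0-indexed, i >= k) replaces a random slot with probability k / (i + 1).
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


# Hypothetical stream standing in for real-time sensor readings.
sensor_readings = (0.1 * i for i in range(100_000))
sample = reservoir_sample(sensor_readings, k=100)
print(len(sample))
```

The stream is read once and only `k` items are held in memory, which is why this kind of approach suits data that is too large to store in full.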
Once the data has been explored and the analysis results have been justified using either traditional statistical models or cutting-edge algorithmic approaches, the results may be communicated and turned into actionable knowledge.
The course is connected to the following study programs
Completing LOG708 Applied Statistics with SPSS is beneficial; however, this prerequisite is not compulsory. Candidates with any background may be accepted.
The student's learning outcomes after completing the course
The candidates will be able to properly organize a data analysis project.
- Within a given business area, the students will be able to formulate a practical question of interest which may be answered with an existing dataset or a dataset that may be procured.
- They will have the skills to retrieve data in various formats from various sources, such as files stored locally and remotely, web application programming interfaces (Web APIs), databases, etc. They will be able to work with various file formats, e.g., text files (txt, rtf, docx), tabular records (csv, xls), markup files (xml, json), and others.
- They will have the necessary skills to explore how the available data fits the question of interest.
- They will have the skills to select the necessary data and to reorganize, transform, and clean it (that is, to handle missing, incorrect, or inconsistent data). They will be able to manipulate arrays of various data types: numerical (integer, floating point, and Boolean), date and time, text, etc.
- They will be able to select an appropriate approach to formal modelling and to evaluate the applicability of the chosen model. For problems of a predictive nature, the students will be able to apply methods such as linear and nonlinear regression, neural networks, random forests, boosting, time-series analysis, etc. For problems of revealing patterns in the data, the students will have the skills to employ approaches such as hierarchical clustering, principal component analysis, factor analysis, and others.
- Upon obtaining the results and diagnosing the fitness of the analytical approach, the students will be able to interpret, visualize and communicate the results to the appropriate audience.
Generally, the candidates will have the appropriate skills to turn a practical issue relevant for a given business activity into a question that may be addressed via tools and methods of data analytics, and then apply the results of the analysis for decision-making.
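The outcomes above (retrieving data from different formats, cleaning it, and fitting a predictive model) can be sketched in a few lines. The example below is a minimal illustration using only the Python standard library; it is not the course's own material. The inline CSV and JSON strings, the column names `distance_km` and `delivery_hours`, and the helper names are all hypothetical, standing in for a local file and a Web API response.

```python
import csv
import io
import json

# Hypothetical data: a CSV "file" with one missing and one invalid value,
# and a JSON payload such as a Web API might return.
CSV_TEXT = """order_id,distance_km,delivery_hours
1,10,2.0
2,20,
3,30,6.0
4,abc,8.0
5,50,10.0
"""
JSON_TEXT = '[{"order_id": 6, "distance_km": 60, "delivery_hours": 12.0}]'


def load_rows(csv_text, json_text):
    """Combine records from a CSV source and a JSON source into one list of dicts."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.extend(json.loads(json_text))
    return rows


def clean(rows):
    """Keep only rows whose two numeric fields can be parsed as floats."""
    points = []
    for row in rows:
        try:
            x = float(row["distance_km"])
            y = float(row["delivery_hours"])
        except (TypeError, ValueError):
            continue  # drop rows with missing or non-numeric values
        points.append((x, y))
    return points


def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


points = clean(load_rows(CSV_TEXT, JSON_TEXT))
slope, intercept = fit_line(points)
print(f"usable rows: {len(points)}, slope: {slope:.3f}, intercept: {intercept:.3f}")
```

Dropping incomplete rows is only one possible cleaning choice; imputing missing values or correcting them at the source may be more appropriate depending on the question being asked.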
Forms of teaching and learning
Three hours of lectures per week.
Coursework requirements - conditions for taking the exam
- Mandatory coursework: Assignment(s)
- Courseworks given: 1
- Courseworks required: 1
- Presence: Required
- Comment: One obligatory homework assignment will be evaluated with a letter grade (A - F) and will contribute 30% to the final grade.
- Form of assessment: Digital school assessment - Inspera
- Proportion: 70%
- Duration: 4 Hours
- Grouping: Individual
- Grading scale: Letter (A - F)
- Support material: Calculator that may contain data + general dictionary in mother tongue/Norwegian/English in paper version
Several homework assignments will be handed out over the course of the semester to practice solving data analysis problems. One obligatory homework assignment will be evaluated with a letter grade (A - F) and will contribute 30% to the final grade. A digital exam at the end of the semester will also be evaluated with a letter grade (A - F) and will contribute 70% to the final grade for this course.
- Kuhn, Max and Kjell Johnson. 2013. Applied Predictive Modeling. Springer.
- O'Neil, Cathy and Rachel Schutt. 2013. Doing Data Science: Straight Talk from the Frontline. O'Reilly Media.
In addition to the textbooks and lecture notes, other relevant material will be available for students in Canvas at the beginning of the semester.