Data on - doubts off.

ABOUT

DATA ANALYTICS TEAM
We are a collaborative team working on a practical data analytics project. The project gives us hands-on experience with real-world tasks and deepens our data analytics skills. Our focus is on collecting, cleaning, and preprocessing data; conducting basic data analysis and creating visualizations; and developing project management skills through teamwork and practical implementation. We gather information from various reliable sources to support our work. This project is an opportunity to apply theoretical knowledge in practice and strengthen our competence in data-driven decision-making.

SOLUTION

In this project, we used the following tools:

  • NumPy: A library for numerical computations in Python, supporting arrays and matrices.
  • Python: The programming language used for this project.
  • Seaborn: A Python visualization library based on Matplotlib.
  • Scikit-learn: A machine learning library for Python.
  • Pandas: A data manipulation and analysis library for Python.
  • Jupyter Notebook: An interactive environment for running Python code.
  • SimFin: A platform for accessing financial data and analytics.

TEAM

Ville Nurminen
Team Leader, Developer
LinkedIn GitHub

Olena Yermolchenko
Developer
LinkedIn GitHub

EVENTS

GATE 1: Plan+Offer

Project developments:

  • Defining the objective
  • Data collection
  • Data storage and initial processing
  • Cleaning and structuring
  • Visualization
  • Data analysis
  • Documentation and reporting
  • Presentation of results

GATE 2: Demo

Cleaning and structuring:

  • Handling missing values, correcting data types,
    and organizing the dataset into a structured format
    using a pandas DataFrame to prepare it for further analysis.
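The cleaning step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`ticker`, `sector`, `market_cap`, `report_date`), the sample values, and the `clean` helper are all hypothetical, and the choice to drop rows without a market value is an assumption.

```python
import pandas as pd

# Hypothetical raw records; column names and values are invented for illustration.
raw = pd.DataFrame(
    {
        "ticker": ["AAA", "BBB", "CCC", "DDD"],
        "sector": ["Tech", "Energy", None, "Tech"],
        "market_cap": ["1200.5", "n/a", "310", "95.25"],
        "report_date": ["2023-03-31", "2023-03-31", "2023-06-30", "2023-06-30"],
    }
)


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with corrected dtypes and missing values handled."""
    out = df.copy()
    # Correct data types: unparseable numbers become NaN instead of raising.
    out["market_cap"] = pd.to_numeric(out["market_cap"], errors="coerce")
    out["report_date"] = pd.to_datetime(out["report_date"], errors="coerce")
    # Handle missing values: label unknown sectors (an assumed policy) and
    # drop rows that have no usable market value.
    out["sector"] = out["sector"].fillna("Unknown")
    out = out.dropna(subset=["market_cap"]).reset_index(drop=True)
    return out


cleaned = clean(raw)
print(cleaned.dtypes)
print(cleaned)
```

Whether a missing value should be filled, flagged, or dropped depends on the column and on the analysis that follows, so the policy above is only one reasonable option.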

Visualization:

  • Bar chart: distribution of market value by sector
  • Histogram: changes in stock prices
  • QQ plot: assessment of distribution normality
  • Tables: financial ratios and market values
  • Interactive dashboard
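As an illustration of what a QQ plot actually draws, the sketch below computes the point pairs (theoretical normal quantile, standardized observed value) using only the Python standard library. The `qq_points` helper, the plotting-position formula `(i - 0.5) / n`, and the sample values are assumptions made for this example; the project's own plots may have been produced differently.

```python
from statistics import NormalDist, mean, stdev


def qq_points(sample):
    """Return (theoretical, observed) quantile pairs for a normal QQ plot."""
    n = len(sample)
    observed = sorted(sample)
    # Standardize the observed values so they are comparable to N(0, 1).
    mu, sigma = mean(observed), stdev(observed)
    standardized = [(x - mu) / sigma for x in observed]
    # Plotting positions (i - 0.5) / n avoid the infinite quantiles at 0 and 1.
    std_normal = NormalDist()
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, standardized))


# Hypothetical daily price changes, invented for illustration.
changes = [-1.2, 0.4, 0.1, -0.3, 0.9, 1.5, -0.7, 0.2, -0.1, 0.6]
for theoretical, observed in qq_points(changes):
    print(f"{theoretical:+.3f}  {observed:+.3f}")
```

If the data are close to normally distributed, these points fall near the diagonal line where both coordinates are equal; systematic curvature at the ends suggests heavy or light tails.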

Data analysis:

  • Linear regression, with the corresponding
    calculations and graphical representations.
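The calculations behind a simple linear regression can be sketched with NumPy as below. This is a minimal ordinary least squares fit of one predictor, written out from the textbook formulas; the `fit_line` helper and the sample data are hypothetical and are not taken from the project.

```python
import numpy as np


def fit_line(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope is the covariance of x and y divided by the variance of x.
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    intercept = y_mean - slope * x_mean
    # R^2 is the share of the variance in y explained by the fitted line.
    residuals = y - (intercept + slope * x)
    r_squared = 1.0 - np.sum(residuals ** 2) / np.sum((y - y_mean) ** 2)
    return float(slope), float(intercept), float(r_squared)


# Hypothetical data, invented for illustration.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, r_squared = fit_line(x, y)
print(f"slope={slope:.3f} intercept={intercept:.3f} r2={r_squared:.4f}")
```

For the graphical representation, the fitted line `intercept + slope * x` is typically drawn over a scatter plot of the original points.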

GATE 3: Demo

Data analysis (machine learning models):

  • XGBoost tree model
  • LSTM model
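The page does not say how the inputs for these models were prepared. One common approach for price series, shown here purely as an assumption, is to turn the series into fixed-length windows of past values with the next value as the target. The `make_windows` helper, the window length, and the sample prices are hypothetical.

```python
import numpy as np


def make_windows(series, window):
    """Split a 1-D series into (past-window, next-value) training pairs."""
    values = np.asarray(series, dtype=float)
    if window < 1 or window >= len(values):
        raise ValueError("window must be between 1 and len(series) - 1")
    # Row i holds values[i : i + window]; its target is the value right after.
    X = np.stack([values[i : i + window] for i in range(len(values) - window)])
    y = values[window:]
    return X, y


# Hypothetical closing prices, invented for illustration.
prices = [10.0, 10.5, 10.2, 10.8, 11.0, 10.9, 11.4]
X, y = make_windows(prices, window=3)
print(X.shape, y.shape)  # (4, 3) (4,)

# A tree model can use the 2-D X directly. Recurrent layers usually expect
# a 3-D input of (samples, timesteps, features), so a feature axis is added.
X_seq = X[..., np.newaxis]
print(X_seq.shape)  # (4, 3, 1)
```

Under this setup both models see the same information, which makes their results easier to compare.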