Project 2: Automatic coding for NACE classification

Authors             Affiliation
Julien PRAMIL
Meilame TAYEBJEE

Technical level     Tasks
Beginner            TO COMPLETE.
Intermediate        TO COMPLETE.
Expert              TO COMPLETE.

Introduction

What is the project about? What data, what methods, what skills?

1 Structure of the project

This project has five steps (listed in the banner at the top of the page):

  • data generation;
  • data preprocessing;
  • model fitting and evaluation;
  • model logging with MLflow;
  • deployment.

2 Initialization of the project

2.1 Clone the project with Git

Attention

To get started with the project, begin by opening a VSCode-python service. To avoid problems later, you must change two elements of the service configuration:

  • In the Networking tab of the configuration, check the box “Enable a custom service port”;
  • In the Kubernetes tab of the configuration, change the role to admin.

Create a VSCode service on SSP Cloud. In the service, open a terminal from the application menu: Terminal > New Terminal. Clone the project repository with:

git clone https://github.com/AIML4OS/funathon-project2.git

The project structure is as follows:

  • The .qmd files and the _quarto.yaml file are necessary to build the website;
  • The file pyproject.toml describes the dependencies of the project;
  • The solutions are available in the solutions folder;
  • TO BE COMPLETED DEPENDING ON THE PROJECT (Dockerfile, Kubernetes…).

2.2 Installation of dependencies

Install the project dependencies by running the following command in the terminal:

uv sync

Attention

For all operations performed from the terminal, you must be located at the root of the Git repository. You can check which folder you are in by looking at the terminal prompt: it should end with funathon-project2, the folder created by the git clone command. If this is not the case, change your location with cd.
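
If the prompt is ambiguous, Git itself can confirm where you are. The following is a minimal sketch, not part of the project: the helper name at_repo_root is hypothetical, and it relies only on git rev-parse --show-prefix, which prints the path of the current directory relative to the repository root (an empty string at the root itself).

```shell
# Hypothetical helper (not provided by the project):
# succeeds only when the current directory is the root of a Git repository.
at_repo_root() {
    # --show-prefix fails outside a repository and prints "" at the root
    prefix="$(git rev-parse --show-prefix 2>/dev/null)" || return 1
    [ -z "$prefix" ]
}

if at_repo_root; then
    echo "OK: you are at the repository root"
else
    echo "Not at the repository root: use cd to move there"
fi
```

From a subfolder of the repository, cd "$(git rev-parse --show-toplevel)" brings you back to the root.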

3 TODO:

  • introduce NACE2.1 classification