The goal of this challenge is to solve the electric power grid control problem using a reinforcement learning based approach. Participants are asked to design a reinforcement learning agent using tools and algorithms of their choice. The designed agents are expected to learn a policy that maximizes the final score returned by the simulator; the possible actions at a given timestep are either switching a line status or changing the line interconnections. To this end, we provide a simulation environment consisting of a power grid simulator along with a set of chronics. We use pypownet, an open-source power grid simulator developed by Marvin Lerousseau, which simulates the behaviour of a power grid of any given characteristics subject to a set of external constraints governed by the given chronics. Samples of such chronics can be found under the sample_data directory.
We also provide a set of baseline solutions:
Typically, on nationwide power grids, dispatchers use two types of physical actions:
- switching ON or OFF some power lines
- switching the nodes on which elements, such as productions, consumptions, or power lines, are connected within their substation
See Random Agent for an example of action.
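To make the two action types concrete, here is a minimal self-contained sketch of a random agent over a toy grid representation. The data structures below (a binary line-status vector and per-substation node assignments) are illustrative assumptions, not the pypownet API:

```python
import random

# Toy illustration of the two physical actions (hypothetical structures,
# NOT the pypownet API): a binary vector of line statuses, and for each
# substation a binary node assignment of its connected elements
# (productions, consumptions, line ends).
N_LINES = 5

def random_action(line_status, substation_topology):
    """Pick one of the two physical action types at random and apply it."""
    if random.random() < 0.5:
        # Action type 1: switch one power line ON/OFF.
        i = random.randrange(len(line_status))
        line_status = line_status.copy()
        line_status[i] = 1 - line_status[i]
    else:
        # Action type 2: move one element to the other node of its substation.
        sub = random.choice(list(substation_topology))
        elem = random.randrange(len(substation_topology[sub]))
        substation_topology = {k: v.copy() for k, v in substation_topology.items()}
        substation_topology[sub][elem] = 1 - substation_topology[sub][elem]
    return line_status, substation_topology

status, topo = random_action([1] * N_LINES, {"sub_1": [0, 0, 1], "sub_2": [0, 1]})
```

Each call changes exactly one bit of the grid configuration, mirroring the fact that a dispatcher performs one elementary action at a time.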
Retrieve the Docker image:
docker pull mok0na/l2rpn:2.0
Retrieve the notebook README.ipynb to help you start this competition. It is available in the starting kit, which you can download from the Files tab in the Participate tab.
Download starting-kit and public data
unzip starting_kit.zip -d starting-kit
cp -r starting-kit ~/aux
Run the jupyter notebook:
docker run --name l2rpn -it -p 5000:8888 -v ~/aux:/home/aux mok0na/l2rpn:2.0 jupyter notebook --ip 0.0.0.0 --notebook-dir=/home/aux --allow-root
Open the printed link, replacing port 8888 with 5000, e.g. http://127.0.0.1:5000/?token=2b4e492be2f542b1ed5a645fa2cfbedcff9e67d50bb35380
To reuse the container later, restart it and open the link http://127.0.0.1:5000
docker start l2rpn
The submission zip is in your local directory ~/aux.
Run the ingestion program:
python ingestion_program/ingestion.py public_data input_data/res ingestion_program example_submission
Score the model
python scoring_program/evaluate.py input_data output
import pypownet.env
import pypownet.agent

class Submission(pypownet.agent.Agent):
    def __init__(self, environment):
        super().__init__(environment)
        assert isinstance(environment, pypownet.env.RunEnv)
        self.environment = environment

    def act(self, observation):
        """Produces an action given an observation of the environment.

        Takes as argument an observation of the current state, and returns
        the chosen action."""
        # Sanity check: an observation is a structured object defined in the
        # environment file.
        assert isinstance(observation, pypownet.env.RunEnv.Observation)
        # Implement your policy here.
        return None
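To illustrate the act() contract without requiring pypownet, here is a self-contained sketch of a simple greedy policy against a mock observation. Everything except the act() method name is an assumption for illustration (the observation fields and the action encoding are not pypownet's):

```python
class MockObservation:
    """Stand-in for a pypownet observation: per-line loading ratios
    (flow divided by thermal limit). Hypothetical, for illustration only."""
    def __init__(self, line_loadings):
        self.line_loadings = line_loadings

class GreedyLineSwitcher:
    """Toy policy: switch the most overloaded line, otherwise do nothing.
    Action encoding (hypothetical): None = do nothing,
    ('switch_line', i) = toggle the status of line i."""
    OVERLOAD_THRESHOLD = 1.0

    def act(self, observation):
        # Find the line closest to (or beyond) its thermal limit.
        worst = max(range(len(observation.line_loadings)),
                    key=lambda i: observation.line_loadings[i])
        if observation.line_loadings[worst] > self.OVERLOAD_THRESHOLD:
            return ('switch_line', worst)
        return None

agent = GreedyLineSwitcher()
action = agent.act(MockObservation([0.4, 1.3, 0.9]))  # line 1 is overloaded
```

The same structure carries over to a real submission: the policy logic lives entirely inside act(), which maps one observation to one action per timestep.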
Essentially, a submission should be a ZIP file containing at least these two files:
Upon receiving a submission, Codalab detects the (mandatory) metadata file, treats the submission as a code submission (as opposed to a results submission), and runs the programs to process it. The folder containing both submission.py and metadata should then be zipped (the archive name is not important). Then, on Codalab:
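The packaging step can be sketched with the standard-library zipfile module; the file contents below are placeholders, and only the two file names at the archive root matter:

```python
import os
import tempfile
import zipfile

# Sketch: build a code-submission archive containing submission.py and the
# metadata file, as described above. Contents here are placeholders.
workdir = tempfile.mkdtemp()
for name, content in [("submission.py", "# agent code\n"),
                      ("metadata", "description: my agent\n")]:
    with open(os.path.join(workdir, name), "w") as f:
        f.write(content)

archive = os.path.join(workdir, "my_submission.zip")
with zipfile.ZipFile(archive, "w") as zf:
    for name in ("submission.py", "metadata"):
        # arcname keeps the files at the archive root, not under a folder.
        zf.write(os.path.join(workdir, name), arcname=name)

names = zipfile.ZipFile(archive).namelist()
```

Zipping the files at the archive root (rather than nesting them inside a directory) avoids the most common cause of rejected submissions on Codalab-style platforms.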
Codalab will take some time to process the submission, and will print the scores on the same page (after refresh).
References and credits:
Pypownet was created by Marvin Lerousseau. The competition protocol was designed by Isabelle Guyon. Our mentors are Balthazar Donon and Antoine Marot. Pypownet, 2017. https://github.com/MarvinLer/pypownet. The baseline methods were inspired by work performed by Kimang Khun.
Mohamed Khalil Jabri, Léa-Marie Lam-Yee-Mui, Youcef Madji, Luca Veyrin-Forrer, Tingting Wang, Yacine Yakoubi
This competition is based on reinforcement learning. The agent you code decides which action to take at each timestep. The aim of the competition is to maximize a reward specifically designed for the problem; the reward is the feedback signal that measures the quality of an agent's actions. For the competition, the reward is the sum of five sub-rewards:
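The aggregation can be sketched as follows; the sub-reward names used here are illustrative placeholders, not the official ones:

```python
# The per-timestep reward is the sum of five sub-rewards; the cumulative
# reward is the sum of timestep rewards over an episode. Sub-reward names
# below are placeholders for illustration only.
def timestep_reward(subrewards):
    assert len(subrewards) == 5
    return sum(subrewards.values())

episode = [
    {"line_usage": -0.3, "load_cut": 0.0, "action_cost": -0.1,
     "distance_to_reference": -0.2, "connexity": 1.0},
    {"line_usage": -0.5, "load_cut": -1.0, "action_cost": 0.0,
     "distance_to_reference": -0.1, "connexity": 1.0},
]
cumulative = sum(timestep_reward(s) for s in episode)
```

An agent is thus judged not on any single sub-reward but on the cumulative sum over the whole episode, which is what the simulator's final score reflects.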
The cumulative reward could be positive. Here is a plot of some cumulative rewards.
Further information can be found here
Submissions must be made before the end of phase 1.
This challenge is governed by the general ChaLearn contest rules.
Start: Nov. 15, 2018, midnight
Description: Development phase: tune your models and submit prediction results, trained model, or untrained model.
Start: May 30, 2019, midnight
Description: Final phase (no submission, your last submission from the previous phase is automatically forwarded).