title: RlveGym Environment Server
emoji: 📡
colorFrom: purple
colorTo: blue
sdk: docker
pinned: false
app_port: 8000
base_path: /web
tags:
- openenv
RlveGym Environment
This package contains a collection of 400 verifiable environments from RLVE-Gym, introduced in the paper "RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments" (see the original GitHub repository).
Quick Start
The simplest way to use the RlveGym environment is through the RlveGymEnv class:
from RLVE_Gym import RlveGymAction, RlveGymEnv

# Create environment from Docker image.
# This happens before the try block so that close() in the finally block
# is only reached when the environment was actually created.
RLVE_Gymenv = RlveGymEnv.from_docker_image("RLVE_Gym-env:latest")

try:
    # Reset
    result = RLVE_Gymenv.reset()
    print(f"Problem Prompt: {result.observation.problem_input}")
    # Or:
    print(f"Problem Prompt (from the environment's state): {RLVE_Gymenv.state().problem_input}")

    # Send multiple outputs
    outputs = [
        "Wrong Format",
        r"<answer>0</answer>",  # Wrong Answer
        r"<answer>" + str(RLVE_Gymenv.problem.parameter["reference_answer"]) + r"</answer>",  # Correct Answer
    ]
    for output in outputs:
        result = RLVE_Gymenv.step(RlveGymAction(output=output))
        print(f"Sent: '{output}'")
        print(f"Result: `{result}`")
finally:
    # Always clean up
    RLVE_Gymenv.close()
That's it! The RlveGymEnv.from_docker_image() method handles:
- Starting the Docker container
- Waiting for the server to be ready
- Connecting to the environment
- Container cleanup when you call close()
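To make the "waiting for the server to be ready" step concrete, here is a minimal sketch of a readiness poll. It is not the code inside from_docker_image(): the helper name wait_until_ready, the probe callable, and the timeout values are all assumptions chosen for illustration.

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],
    timeout_s: float = 30.0,
    interval_s: float = 0.5,
) -> bool:
    """Call `probe` repeatedly until it returns True or `timeout_s` elapses.

    Hypothetical helper: `probe` stands in for whatever readiness check a
    client performs (for example, an HTTP GET against the server).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            if probe():
                return True
        except OSError:
            # Connection refused/reset while the container is still starting.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Example with a fake probe that succeeds on the third attempt.
attempts = {"count": 0}


def fake_probe() -> bool:
    attempts["count"] += 1
    return attempts["count"] >= 3


ready = wait_until_ready(fake_probe, timeout_s=5.0, interval_s=0.01)
print(ready, attempts["count"])
```

Passing the probe in as a callable keeps the polling loop independent of how readiness is actually detected, so the same loop can be exercised without a running container.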
Building the Docker Image
Before using the environment, you need to build the Docker image:
# From project root
docker build -t RLVE_Gym-env:latest -f server/Dockerfile .
Deploying to Hugging Face Spaces
You can easily deploy your OpenEnv environment to Hugging Face Spaces using the openenv push command:
# From the environment directory (where openenv.yaml is located)
openenv push
# Or specify options
openenv push --namespace my-org --private
The openenv push command will:
- Validate that the directory is an OpenEnv environment (checks for openenv.yaml)
- Prepare a custom build for Hugging Face Docker space (enables web interface)
- Upload to Hugging Face (ensuring you're logged in)
Prerequisites
- Authenticate with Hugging Face: The command will prompt for login if not already authenticated
Options
- --directory, -d: Directory containing the OpenEnv environment (defaults to current directory)
- --repo-id, -r: Repository ID in format 'username/repo-name' (defaults to 'username/env-name' from openenv.yaml)
- --base-image, -b: Base Docker image to use (overrides Dockerfile FROM)
- --private: Deploy the space as private (default: public)
Examples
# Push to your personal namespace (defaults to username/env-name from openenv.yaml)
openenv push
# Push to a specific repository
openenv push --repo-id my-org/my-env
# Push with a custom base image
openenv push --base-image ghcr.io/meta-pytorch/openenv-base:latest
# Push as a private space
openenv push --private
# Combine options
openenv push --repo-id my-org/my-env --base-image custom-base:latest --private
After deployment, your space will be available at:
https://huggingface.co/spaces/<repo-id>
The deployed space includes:
- Web Interface at /web - Interactive UI for exploring the environment
- API Documentation at /docs - Full OpenAPI/Swagger interface
- Health Check at /health - Container health monitoring
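The three paths above can also be reached from a script. The sketch below builds the URLs and probes the health endpoint using only the standard library. The function names are hypothetical, and base_url is an assumption: it must be the address where the server is actually reachable (for example http://localhost:8000 for a local run). The format of the health response is not documented here, so the sketch only checks the HTTP status code.

```python
from urllib.error import URLError
from urllib.request import urlopen

# Paths listed above for the deployed space.
ENDPOINT_PATHS = {"web": "/web", "docs": "/docs", "health": "/health"}


def endpoint_urls(base_url: str) -> dict:
    """Join a base URL with each documented path, avoiding double slashes."""
    root = base_url.rstrip("/")
    return {name: root + path for name, path in ENDPOINT_PATHS.items()}


def is_healthy(base_url: str, timeout_s: float = 5.0) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200."""
    try:
        with urlopen(endpoint_urls(base_url)["health"], timeout=timeout_s) as resp:
            return resp.status == 200
    except (URLError, OSError):
        # Covers connection errors, timeouts, and non-2xx responses.
        return False


urls = endpoint_urls("http://localhost:8000/")
print(urls["health"])
```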
Environment Details
Environment Initialization
Please check here for detailed usage:
- environment_identifier (str) - The environment's identifier. Check here for detailed usage.
- difficulty (int) - The difficulty of generated problems.
- answer_markers (Tuple[str] of length 2) - How the environment extracts the final answer from a model output.
- seed (int) - The initial seed to use when generating the first problem. Whenever reset() is called, the seed will be incremented by 1.
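As an illustration of what answer_markers controls, the sketch below pulls the text between a pair of markers out of a model output. It is an assumption about the behavior, not the environment's actual parser: the function name extract_answer is hypothetical, the marker pair is taken from the Quick Start example, and picking the last marked span is a design guess.

```python
from typing import Optional, Tuple


def extract_answer(output: str, answer_markers: Tuple[str, str]) -> Optional[str]:
    """Return the text between the last opening marker and the next closing marker.

    Returns None when either marker is missing, which a verifier could treat
    as a format error. Hypothetical helper, not RLVE-Gym's own parser.
    """
    open_marker, close_marker = answer_markers
    start = output.rfind(open_marker)
    if start == -1:
        return None
    start += len(open_marker)
    end = output.find(close_marker, start)
    if end == -1:
        return None
    return output[start:end].strip()


markers = ("<answer>", "</answer>")
print(extract_answer("Wrong Format", markers))
print(extract_answer("So the result is <answer> 42 </answer>", markers))
```

Under these assumptions, an output without markers yields None, which would correspond to the "Wrong Format" case in the Quick Start.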
Action
RlveGymAction: Contains a single field:
- output (str) - The model's output to be verified.
State
RlveGymState:
- seed (int) - The seed to use when running reset().
- problem_input (Optional[str]) - The input of the problem; if it is None, it means that the problem generation has not been run, or it failed.
- num_samples (int) and sum_accuracy (int) - The statistics of the result of step(action) so far for the current problem (the number of outputs sent to the verifier and the number of correct ones).
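The two counters in the state are enough to compute a running accuracy for the current problem. A minimal sketch, using a local dataclass as a stand-in for RlveGymState: the class StateSnapshot and the function mean_accuracy are hypothetical, only the field names come from the list above, and the values are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StateSnapshot:
    """Local stand-in mirroring the documented RlveGymState fields."""
    seed: int
    problem_input: Optional[str]
    num_samples: int
    sum_accuracy: int


def mean_accuracy(state: StateSnapshot) -> Optional[float]:
    """Fraction of verified outputs that were correct, or None if none were sent."""
    if state.num_samples == 0:
        return None
    return state.sum_accuracy / state.num_samples


# Illustrative values: three outputs sent to the verifier, one of them correct.
state = StateSnapshot(seed=1, problem_input="example problem", num_samples=3, sum_accuracy=1)
print(mean_accuracy(state))
```

Returning None for zero samples avoids a division by zero and keeps "no attempts yet" distinct from "all attempts wrong".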
Observation
RlveGymObservation:
- problem_input (Optional[str]) - The input of the problem; if it is None, it means that the problem generation has not been run, or it failed.
- verifier_result (Optional[dict]) - Contains reward as the raw reward, accuracy as the 0/1 correctness, and format_score as the 0/1 format correctness.
- success (bool) - True or False indicates whether the operation succeeds.
- message (str) - The explanation of success.
- reward (Optional[float]) - The value is verifier_result["reward"].
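The fields above suggest a simple way to interpret a step result: check success first, then the format score, then accuracy. The sketch below does this on plain dicts that stand in for RlveGymObservation. The function name describe_observation is hypothetical, and the reward values and messages in the examples are invented for illustration; the real values depend on the environment.

```python
from typing import Any, Dict


def describe_observation(obs: Dict[str, Any]) -> str:
    """Classify an observation given as a plain dict with the documented fields."""
    if not obs.get("success", False):
        return f"operation failed: {obs.get('message', '')}"
    result = obs.get("verifier_result")
    if result is None:
        # For example, nothing has been verified for this observation.
        return "no verification result"
    if result["format_score"] == 0:
        return "wrong format"
    if result["accuracy"] == 0:
        return "wrong answer"
    return "correct"


# Hypothetical observations; the reward numbers are placeholders.
examples = [
    {"success": True, "message": "ok",
     "verifier_result": {"reward": -1.0, "accuracy": 0, "format_score": 0}},
    {"success": True, "message": "ok",
     "verifier_result": {"reward": 0.0, "accuracy": 0, "format_score": 1}},
    {"success": True, "message": "ok",
     "verifier_result": {"reward": 1.0, "accuracy": 1, "format_score": 1}},
    {"success": False, "message": "no problem generated", "verifier_result": None},
]
labels = [describe_observation(obs) for obs in examples]
print(labels)
```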
Advanced Usage
Connecting to an Existing Server
If you already have an RlveGymEnv server running, you can connect directly:
from RLVE_Gym import RlveGymAction, RlveGymEnv

# Connect to existing server
RLVE_Gymenv = RlveGymEnv(base_url="<ENV_HTTP_URL_HERE>")

# Use as normal
result = RLVE_Gymenv.reset()
result = RLVE_Gymenv.step(RlveGymAction(output="Hello!"))
Note: When connecting to an existing server, RLVE_Gymenv.close() will NOT stop the server.
Development & Testing
Direct Environment Testing
Test the environment logic directly without starting the HTTP server:
# From the project root
python3 server/RLVE_Gym_environment.py
This verifies that:
- Environment resets correctly
- Step executes actions properly
- State tracking works
- Rewards are calculated correctly
Running Locally
Run the server locally for development:
uvicorn server.app:app --reload