---
datasets:
  - code-search-net/code_search_net
base_model:
  - openai-community/gpt2
pipeline_tag: token-classification, text-generation, code-generation
library_name: transformers
language:
  - en
---

Detailed Model Description

A GPT-2-based tokenizer further trained on more than 400,000 Python functions from the CodeSearchNet dataset. It keeps GPT-2's byte-level BPE scheme and adds compact encodings for indentation, common keywords, operators, and camel-case identifiers, so it can serve as the tokenizer in code-generation or code-understanding pipelines.
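The card does not say how the tokenizer was trained. One plausible route, sketched here purely as an assumption, is to retrain the GPT-2 tokenizer with the `train_new_from_iterator` method of fast tokenizers in `transformers`. The helper names `iter_batches` and `train_code_tokenizer`, the `"python"` dataset configuration, the `whole_func_string` column, and the vocabulary size of 52000 are all illustrative choices, not values stated by the author, and loading the dataset may need adjusting for your `datasets` version.

```python
def iter_batches(rows, column, batch_size=1000):
    """Yield lists of at most `batch_size` strings taken from `rows[column]`.

    `rows` is anything that supports len() and slicing into a dict of lists,
    which is how a Hugging Face `Dataset` behaves.
    """
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size][column]


def train_code_tokenizer(vocab_size=52000):
    """Sketch: retrain GPT-2's tokenizer on Python functions.

    Downloads the dataset and the base tokenizer, so it needs network access.
    """
    # Imported lazily so iter_batches stays usable without these packages.
    from datasets import load_dataset
    from transformers import AutoTokenizer

    # Assumption: the "python" configuration and the "whole_func_string" column.
    dataset = load_dataset(
        "code-search-net/code_search_net", "python", split="train"
    )
    base = AutoTokenizer.from_pretrained("openai-community/gpt2")

    # Assumption: vocab_size is a guess; the card does not state it.
    corpus = iter_batches(dataset, "whole_func_string")
    return base.train_new_from_iterator(corpus, vocab_size)
```

Calling `train_code_tokenizer()` returns a new tokenizer object that can be saved with `save_pretrained`. Feeding the corpus in batches keeps memory use bounded, since only one slice of the dataset is materialized at a time.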

Usage Examples:

    example = """class LinearLayer():
        def __init__(self, input_size, output_size):
            self.weight = torch.randn(input_size, output_size)
            self.bias = torch.zeros(output_size)

        def __call__(self, x):
            return x @ self.weights + self.bias
        """

Performance: ['class', 'ĠLinear', 'Layer', '():', 'ĊĠĠĠ', 'Ġdef', 'Ġ__', 'init', '__(', 'self', ',', 'Ġinput', '_', 'size', ',', 'Ġoutput', '_', 'size', '):', 'ĊĠĠĠĠĠĠĠ', 'Ġself', '.', 'weight', 'Ġ=', 'Ġtorch', '.', 'randn', '(', 'input', '_', 'size', ',', 'Ġoutput', '_', 'size', ')', 'ĊĠĠĠĠĠĠ', 'Ġself', '.', 'bias', 'Ġ=', 'Ġtorch', '.', 'zeros', '(', 'output', '_', 'size', ')', 'ĊĊĠĠ', 'Ġdef', 'Ġ__', 'call', '__(', 'self', ',', 'Ġx', '):', 'ĊĠĠĠĠĠĠ', 'Ġreturn', 'Ġx', 'Ġ@', 'Ġself', '.', 'weights', 'Ġ+', 'Ġself', '.', 'bias', 'ĊĠĠĠĠ']
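In the token list above, `Ġ` stands for a space and `Ċ` for a newline. GPT-2's byte-level BPE maps every byte to a printable character before merging, so whitespace shows up as these stand-ins. The sketch below rebuilds that byte-to-character table and inverts it to turn token strings back into source text. The function name `tokens_to_text` and the `sample` slice are illustrative and not part of the tokenizer's API.

```python
def bytes_to_unicode():
    """Return the byte -> printable character table used by GPT-2's BPE."""
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    for b in range(256):
        if b not in printable:
            # Non-printable bytes are moved to code points 256 and up,
            # e.g. space (32) becomes "Ġ" and newline (10) becomes "Ċ".
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: printable character -> original byte.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def tokens_to_text(tokens):
    """Join byte-level BPE token strings and decode them back to text."""
    raw = bytes(UNICODE_TO_BYTE[ch] for tok in tokens for ch in tok)
    return raw.decode("utf-8", errors="replace")


# First six tokens of the output shown above.
sample = ['class', 'ĠLinear', 'Layer', '():', 'ĊĠĠĠ', 'Ġdef']
print(repr(tokens_to_text(sample)))  # 'class LinearLayer():\n    def'
```

Note how `'ĊĠĠĠ'` followed by `'Ġdef'` decodes to a newline plus four spaces: the indentation is captured in a single token instead of one token per space.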