Required PyTorch version

#14

by tlphams - opened Jan 26, 2023

Jan 26, 2023

I tried to run the model with PyTorch 1.9.0+cu111, but the generated text is bizarre and full of duplicated words. Is there a required version of torch or of the other libraries? Thank you.

tlphams changed discussion status to closed Jan 26, 2023

loubnabnl

BigCode org Jan 26, 2023

edited Jan 26, 2023

Can you please share the code you used to generate text? The PyTorch version shouldn't impact the generation. One thing to pay attention to: do not pass the token_type_ids returned by the tokenizer to the model. Here's a working example that uses the model in both the standard and FIM (fill-in-the-middle) settings:

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("bigcode/santacoder", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("bigcode/santacoder")

# standard example
input_text = "def all_odd_elements((L):\n"
# FIM example: add the FIM special tokens <fim-prefix>, <fim-middle> and <fim-suffix>
input_text_fim = "<fim-prefix>def fib(n):<fim-suffix>    else:\n        return fib(n - 2) + fib(n - 1)<fim-middle>"

# tokenizer(inputs) returns input_ids, attention_mask and token_type_ids; the latter shouldn't be fed to the model,
# so if you want to use model(**inputs) or model.generate(**inputs), add return_token_type_ids=False so it is not returned

inputs = tokenizer(input_text, return_tensors="pt") # add return_token_type_ids=False for model(**inputs) 
inputs_fim = tokenizer(input_text_fim, return_tensors="pt")  # add return_token_type_ids=False for model(**inputs) 

outputs = model.generate(inputs["input_ids"], max_new_tokens=18)
outputs_fim = model.generate(inputs_fim["input_ids"], max_new_tokens=25)

generation = [tokenizer.decode(tensor, skip_special_tokens=False) for tensor in outputs]
generation_fim = [tokenizer.decode(tensor, skip_special_tokens=False) for tensor in outputs_fim]

print(f"Standard example:\n {generation[0]}")
print(f"FIM example:\n {generation_fim[0]}")

Standard example:
 def all_odd_elements((L):
    return all(x % 2!= 0 for x in L)


FIM example:
 <fim-prefix>def fib(n):<fim-suffix>    else:
        return fib(n - 2) + fib(n - 1)<fim-middle>
    if n == 0:
        return 0
    elif n == 1:
        return 1
<|endoftext|><fim-prefix>
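
For reference, the FIM prompt layout used above (prefix, then suffix, then the middle marker) can be assembled and parsed with two small helpers. This is only a sketch: build_fim_prompt and extract_middle are hypothetical names, not part of transformers or the model repo, and the token strings are simply the ones shown in the example above.

```python
# Special tokens as they appear in the example above.
FIM_PREFIX = "<fim-prefix>"
FIM_MIDDLE = "<fim-middle>"
FIM_SUFFIX = "<fim-suffix>"
END_OF_TEXT = "<|endoftext|>"


def build_fim_prompt(prefix, suffix):
    """Lay out the prompt as prefix, suffix, then the middle marker (hypothetical helper)."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


def extract_middle(decoded):
    """Return the text generated after <fim-middle>, cut at <|endoftext|> if present."""
    _, _, tail = decoded.partition(FIM_MIDDLE)
    middle, _, _ = tail.partition(END_OF_TEXT)
    return middle


prompt = build_fim_prompt(
    "def fib(n):",
    "    else:\n        return fib(n - 2) + fib(n - 1)",
)
```

The resulting prompt string can be passed to the tokenizer in place of input_text_fim, and extract_middle can be applied to the decoded output (decoded with skip_special_tokens=False so the markers are still there).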

tlphams

Jan 27, 2023

Yeah, there was a mistake in my text generation code ^^ I have changed the code and it is working well now.
Previously:

inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)

I checked the README.md again and changed it to:

inputs = tokenizer.encode(input_text, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=64)

I also just tried adding return_token_type_ids=False in the first case, and that works too. Thank you ^^

loubnabnl

BigCode org Jan 28, 2023

Great! You don't even need to specify return_token_type_ids=False now; we turned it off by default.
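
If you are on a tokenizer revision that still returns token_type_ids, one way to keep calling model.generate(**inputs) is to drop that key before unpacking. A minimal sketch, assuming the tokenizer output behaves like a dict; drop_token_type_ids is a hypothetical helper name, not part of transformers, and the plain lists below only stand in for the real tensors.

```python
def drop_token_type_ids(encoded):
    """Return a plain dict copy of the tokenizer output without token_type_ids."""
    return {key: value for key, value in encoded.items() if key != "token_type_ids"}


# Stand-in for tokenizer(input_text, return_tensors="pt"); real values would be tensors.
encoded = {
    "input_ids": [[1, 2, 3]],
    "attention_mask": [[1, 1, 1]],
    "token_type_ids": [[0, 0, 0]],
}
inputs = drop_token_type_ids(encoded)
# inputs can now be unpacked safely: model.generate(**inputs, max_new_tokens=64)
```

The helper leaves the original tokenizer output untouched and is a no-op when token_type_ids is not present.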
