Last active
February 2, 2025 23:11
-
-
Save fswair/d9891cfd5dc58368146c7edf061f9188 to your computer and use it in GitHub Desktop.
llama code interact
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| # thoughts | |
| # 1) if you do not use it for more than one time, do not define a variable | |
| # 2) add detailed annotations to understanding code and best editor support | |
| # 3) do not use f-string 4 times, you can apply them all in one f-string expression | |
| # # 3.1) if you did not change sep parameter, then you dont need to send strings one by one | |
| # 4) you can use f-string expression to multiply characters | |
| def correctness_reward_func2(prompts: List[List[Mapping[str, str]]], completions: List[Mapping[str, str]], answer: List[Any], **kwargs) -> list[float]: | |
| responses = [completion[0]['content'] for completion in completions] | |
| extracted_responses = [extract_xml_answer(r) for r in responses] | |
| print(f"{'':->20} Question:\n{prompts[0][-1]['content']}\nAnswer:\n{answer[0]}\nResponse:\n{responses[0]}\nExtracted:\n{extracted_responses[0]}") | |
| return [2.0 if r == a else 0.0 for r, a in zip(extracted_responses, answer)] |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment