Add More Grammars to outlines.grammars, Benchmark and Verify Their Integrity #587

Closed
wants to merge 85 commits into from

Conversation

lapp0
Contributor

@lapp0 lapp0 commented Jan 25, 2024

Overview:

  • Adds grammars to outlines.grammars and fixes some grammars
  • Adds benchmarks for CFGFSM (pytest --benchmark-cfg)
  • Fixes an UnexpectedToken error raised when a partially generated terminal is lexed as a complete terminal (see fsm.py)

This PR changes many files, but most of them are new Lark grammars, improvements to existing Lark grammars, and sample test files.

  • Python changes:
    • Add test cases
    • Update outlines/grammars.py with new grammars
    • Prevent UnexpectedToken in fsm.py when a partially generated terminal is misrecognized as a different terminal
  • Lark changes:
    • Fix grammars
    • Add grammars
  • Documentation:
    • Add documentation on creating grammars (Rendered)
    • Add documentation on using outlines.grammars
  • Add test generation files to outlines/tests/benchmark/cfg_samples/
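
To illustrate the UnexpectedToken problem mentioned above, here is a minimal sketch using a toy longest-match lexer. It is not the code in fsm.py: the terminals, `lex`, and `lex_partial` are hypothetical names made up for this example. It shows how the text generated so far can lex as one terminal (INT) even though the finished text would lex as another (FLOAT), and one conservative remedy, which is to treat the last token as pending rather than final.

```python
import re

# Toy terminals, hypothetical and unrelated to any grammar in this PR.
TERMINALS = [
    ("FLOAT", re.compile(r"\d+\.\d+")),
    ("INT", re.compile(r"\d+")),
    ("PLUS", re.compile(r"\+")),
]


def lex(text):
    """Greedy longest-match lexer over TERMINALS."""
    tokens, pos = [], 0
    while pos < len(text):
        best_name, best_match = None, None
        for name, pattern in TERMINALS:
            match = pattern.match(text, pos)
            if match and (best_match is None or match.end() > best_match.end()):
                best_name, best_match = name, match
        if best_match is None:
            raise ValueError(f"no terminal matches at position {pos}")
        tokens.append((best_name, best_match.group()))
        pos = best_match.end()
    return tokens


def lex_partial(text):
    """Split tokens into (complete, pending); the last token may still grow."""
    tokens = lex(text)
    return tokens[:-1], tokens[-1:]


# Mid-generation, "3" looks like a full INT...
print(lex("1+3"))
# ...but the finished text contains a FLOAT instead.
print(lex("1+3.14"))
# Holding back the last token avoids committing to INT too early.
print(lex_partial("1+3"))
```

A parser that only accepts FLOAT after PLUS would reject the INT token from the partial text, even though the generation is still on a valid path.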

Fixes:

Prerequisites

Create / Fix Grammars

  • outlines.grammars.csv
  • Fix ESCAPED_STRING in outlines.grammars.json
  • outlines.grammars.lark
  • outlines.grammars.sql_select (maybe reach out to https://motherduck.com/blog/duckdb-text2sql-llm/)
  • Remove outlines.grammars.lisp
  • Fix outlines.grammars.arithmetic recursion issue

Other Work

  • End to end + benchmark tests for grammars
  • Enable some lookarounds by changing how CFGFSM.regex_fsm is constructed in outlines/fsm/fsm.py
  • Fix ESCAPED_STRING in common.lark so it's compatible with interegular
  • Rename all test files (e.g. foo.py.test) so they don't confuse users searching the repo
  • Test / bench samples for outlines.grammars.arithmetic
  • Document lookaround behavior for https://github.com/MegaIng/interegular/
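
As a rough illustration of the ESCAPED_STRING item, the sketch below shows one lookaround-free way to match a double-quoted string with backslash escapes. It assumes the incompatibility with interegular comes from a lookbehind in the stock definition; the pattern here is a common textbook formulation and not necessarily the one used in this PR. The helper name `is_escaped_string` is hypothetical.

```python
import re

# A quoted string is: an opening quote, then any number of either
# (a) a character that is neither a quote nor a backslash, or
# (b) a backslash followed by any single character,
# then a closing quote. No lookahead or lookbehind is needed.
ESCAPED_STRING = re.compile(r'"(?:[^"\\]|\\.)*"')


def is_escaped_string(text):
    """Return True if the whole text is one escaped, double-quoted string."""
    return ESCAPED_STRING.fullmatch(text) is not None


for candidate in ['"abc"', r'"a\"b"', r'"a\\"', r'"a\"', '"a"b"']:
    print(candidate, is_escaped_string(candidate))
```

Because the pattern consumes each escape as a two-character unit, it never has to look backwards to decide whether a quote is escaped.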

Documentation Work

  • [ ] Documentation and examples of grammars in use
  • Documentation of how to create grammars

Out of scope because they require the introduction of context-sensitive indentation handling (#592)

  • outlines.grammars.python3 (and outlines.grammars.python3_interactive)
  • outlines.grammars.yaml

Out of scope otherwise

  • outlines.grammars.bash

Questions:

  • These tests take roughly five times as long as all the other tests combined. I suggest that a plain pytest run executes only one of them by default, and that pytest --do-benchmark runs all of them. A single test is enough to catch a changeset that breaks CFG in general, without testing every grammar.
  • Should these be run within CI, or should PR authors manually run them when they're potentially impacting performance?
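
One possible way to implement the first suggestion is a pair of hooks in conftest.py. This is only a sketch under assumptions: the `--do-benchmark` flag is the one proposed above and does not exist yet, and matching on the substring `test_benchmark_cfg_generation` in the node ID is an illustrative choice. The hooks themselves (`pytest_addoption`, `pytest_collection_modifyitems`, `pytest_deselected`) are standard pytest hooks.

```python
# conftest.py (sketch)

BENCHMARK_MARKER = "test_benchmark_cfg_generation"  # assumed test name


def pytest_addoption(parser):
    """Register the proposed opt-in flag for the full benchmark suite."""
    parser.addoption(
        "--do-benchmark",
        action="store_true",
        default=False,
        help="Run every CFG benchmark case instead of a single smoke case.",
    )


def pytest_collection_modifyitems(config, items):
    """Without --do-benchmark, keep only the first CFG benchmark case."""
    if config.getoption("--do-benchmark"):
        return
    kept, deselected = [], []
    seen_benchmark = False
    for item in items:
        if BENCHMARK_MARKER in item.nodeid:
            if seen_benchmark:
                deselected.append(item)
                continue
            seen_benchmark = True
        kept.append(item)
    if deselected:
        # Report the dropped cases so they show up as "deselected".
        config.hook.pytest_deselected(items=deselected)
        items[:] = kept
```

With this in place, a plain `pytest` run would still exercise one CFG case, while `pytest --do-benchmark` would collect all of them.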

Output

Additional Benchmark Details:
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[lark_lark_self_grammar.lark.test-True]:
	Tokens / Second: 16.201
	(Num Tokens: 2771, Time: 171.035 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[lark_lark_self_grammar.lark.test-False]:
	Tokens / Second: 15.405
	(Num Tokens: 2771, Time: 179.879 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_many_features.sql.test-True]:
	Tokens / Second: 18.946
	(Num Tokens: 1294, Time: 68.300 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_many_features.sql.test-False]:
	Tokens / Second: 11.907
	(Num Tokens: 1294, Time: 108.675 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_simple.sql.test-True]:
	Tokens / Second: 21.232
	(Num Tokens: 17, Time: 0.801 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_simple.sql.test-False]:
	Tokens / Second: 2.913
	(Num Tokens: 17, Time: 5.836 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_order.sql.test-True]:
	Tokens / Second: 14.272
	(Num Tokens: 47, Time: 3.293 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_order.sql.test-False]:
	Tokens / Second: 3.614
	(Num Tokens: 47, Time: 13.004 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_having.sql.test-True]:
	Tokens / Second: 19.313
	(Num Tokens: 219, Time: 11.339 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_having.sql.test-False]:
	Tokens / Second: 6.501
	(Num Tokens: 219, Time: 33.686 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_nested_subquery.sql.test-True]:
	Tokens / Second: 17.400
	(Num Tokens: 462, Time: 26.551 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_nested_subquery.sql.test-False]:
	Tokens / Second: 8.587
	(Num Tokens: 462, Time: 53.801 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_coalesce.sql.test-True]:
	Tokens / Second: 21.573
	(Num Tokens: 230, Time: 10.662 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_coalesce.sql.test-False]:
	Tokens / Second: 7.271
	(Num Tokens: 230, Time: 31.631 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_union.sql.test-True]:
	Tokens / Second: 20.007
	(Num Tokens: 195, Time: 9.747 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_union.sql.test-False]:
	Tokens / Second: 8.050
	(Num Tokens: 195, Time: 24.224 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[arithmetic_lots_of_ops.arithmetic.test-True]:
	Tokens / Second: 21.246
	(Num Tokens: 515, Time: 24.240 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[arithmetic_lots_of_ops.arithmetic.test-False]:
	Tokens / Second: 20.819
	(Num Tokens: 515, Time: 24.737 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[arithmetic_simple_math.arithmetic.test-True]:
	Tokens / Second: 21.222
	(Num Tokens: 26, Time: 1.225 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[arithmetic_simple_math.arithmetic.test-False]:
	Tokens / Second: 15.303
	(Num Tokens: 26, Time: 1.699 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[json_simple_fruit.json.test-True]:
	Tokens / Second: 35.228
	(Num Tokens: 381, Time: 10.815 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[json_simple_fruit.json.test-False]:
	Tokens / Second: 28.008
	(Num Tokens: 381, Time: 13.603 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[json_outlines.generate.samplers.mypy.json.test-True]:
	Tokens / Second: 20.789
	(Num Tokens: 13620, Time: 655.139 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[json_outlines.generate.samplers.mypy.json.test-False]:
	Tokens / Second: 17.473
	(Num Tokens: 13620, Time: 779.487 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[json_simple_fruit_no_indent.json.test-True]:
	Tokens / Second: 30.341
	(Num Tokens: 197, Time: 6.493 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[json_simple_fruit_no_indent.json.test-False]:
	Tokens / Second: 18.822
	(Num Tokens: 197, Time: 10.466 seconds)

-------------------------------------------------------------------------------------------------------------------------- benchmark: 26 tests ---------------------------------------------------------------------------------------------------------------------------
Name (time in ms)                                                                                Min                     Max                    Mean            StdDev                  Median               IQR            Outliers     OPS            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_cfg_generation[sql_select_select_simple.sql.test-True]                       800.6749 (1.0)          800.6749 (1.0)          800.6749 (1.0)      0.0000 (1.0)          800.6749 (1.0)      0.0000 (1.0)           0;0  1.2489 (1.0)           1           1
test_benchmark_cfg_generation[arithmetic_simple_math.arithmetic.test-True]                1,225.1380 (1.53)       1,225.1380 (1.53)       1,225.1380 (1.53)     0.0000 (1.0)        1,225.1380 (1.53)     0.0000 (1.0)           0;0  0.8162 (0.65)          1           1
test_benchmark_cfg_generation[arithmetic_simple_math.arithmetic.test-False]               1,699.0278 (2.12)       1,699.0278 (2.12)       1,699.0278 (2.12)     0.0000 (1.0)        1,699.0278 (2.12)     0.0000 (1.0)           0;0  0.5886 (0.47)          1           1
test_benchmark_cfg_generation[sql_select_select_order.sql.test-True]                      3,293.2230 (4.11)       3,293.2230 (4.11)       3,293.2230 (4.11)     0.0000 (1.0)        3,293.2230 (4.11)     0.0000 (1.0)           0;0  0.3037 (0.24)          1           1
test_benchmark_cfg_generation[sql_select_select_simple.sql.test-False]                    5,836.3914 (7.29)       5,836.3914 (7.29)       5,836.3914 (7.29)     0.0000 (1.0)        5,836.3914 (7.29)     0.0000 (1.0)           0;0  0.1713 (0.14)          1           1
test_benchmark_cfg_generation[json_simple_fruit_no_indent.json.test-True]                 6,492.7985 (8.11)       6,492.7985 (8.11)       6,492.7985 (8.11)     0.0000 (1.0)        6,492.7985 (8.11)     0.0000 (1.0)           0;0  0.1540 (0.12)          1           1
test_benchmark_cfg_generation[sql_select_select_union.sql.test-True]                      9,746.7852 (12.17)      9,746.7852 (12.17)      9,746.7852 (12.17)    0.0000 (1.0)        9,746.7852 (12.17)    0.0000 (1.0)           0;0  0.1026 (0.08)          1           1
test_benchmark_cfg_generation[json_simple_fruit_no_indent.json.test-False]               10,466.2372 (13.07)     10,466.2372 (13.07)     10,466.2372 (13.07)    0.0000 (1.0)       10,466.2372 (13.07)    0.0000 (1.0)           0;0  0.0955 (0.08)          1           1
test_benchmark_cfg_generation[sql_select_select_coalesce.sql.test-True]                  10,661.6278 (13.32)     10,661.6278 (13.32)     10,661.6278 (13.32)    0.0000 (1.0)       10,661.6278 (13.32)    0.0000 (1.0)           0;0  0.0938 (0.08)          1           1
test_benchmark_cfg_generation[json_simple_fruit.json.test-True]                          10,815.3759 (13.51)     10,815.3759 (13.51)     10,815.3759 (13.51)    0.0000 (1.0)       10,815.3759 (13.51)    0.0000 (1.0)           0;0  0.0925 (0.07)          1           1
test_benchmark_cfg_generation[sql_select_select_having.sql.test-True]                    11,339.4663 (14.16)     11,339.4663 (14.16)     11,339.4663 (14.16)    0.0000 (1.0)       11,339.4663 (14.16)    0.0000 (1.0)           0;0  0.0882 (0.07)          1           1
test_benchmark_cfg_generation[sql_select_select_order.sql.test-False]                    13,004.0021 (16.24)     13,004.0021 (16.24)     13,004.0021 (16.24)    0.0000 (1.0)       13,004.0021 (16.24)    0.0000 (1.0)           0;0  0.0769 (0.06)          1           1
test_benchmark_cfg_generation[json_simple_fruit.json.test-False]                         13,603.4424 (16.99)     13,603.4424 (16.99)     13,603.4424 (16.99)    0.0000 (1.0)       13,603.4424 (16.99)    0.0000 (1.0)           0;0  0.0735 (0.06)          1           1
test_benchmark_cfg_generation[sql_select_select_union.sql.test-False]                    24,224.4266 (30.26)     24,224.4266 (30.26)     24,224.4266 (30.26)    0.0000 (1.0)       24,224.4266 (30.26)    0.0000 (1.0)           0;0  0.0413 (0.03)          1           1
test_benchmark_cfg_generation[arithmetic_lots_of_ops.arithmetic.test-True]               24,239.8884 (30.27)     24,239.8884 (30.27)     24,239.8884 (30.27)    0.0000 (1.0)       24,239.8884 (30.27)    0.0000 (1.0)           0;0  0.0413 (0.03)          1           1
test_benchmark_cfg_generation[arithmetic_lots_of_ops.arithmetic.test-False]              24,736.5881 (30.89)     24,736.5881 (30.89)     24,736.5881 (30.89)    0.0000 (1.0)       24,736.5881 (30.89)    0.0000 (1.0)           0;0  0.0404 (0.03)          1           1
test_benchmark_cfg_generation[sql_select_select_nested_subquery.sql.test-True]           26,551.4368 (33.16)     26,551.4368 (33.16)     26,551.4368 (33.16)    0.0000 (1.0)       26,551.4368 (33.16)    0.0000 (1.0)           0;0  0.0377 (0.03)          1           1
test_benchmark_cfg_generation[sql_select_select_coalesce.sql.test-False]                 31,631.2438 (39.51)     31,631.2438 (39.51)     31,631.2438 (39.51)    0.0000 (1.0)       31,631.2438 (39.51)    0.0000 (1.0)           0;0  0.0316 (0.03)          1           1
test_benchmark_cfg_generation[sql_select_select_having.sql.test-False]                   33,686.4307 (42.07)     33,686.4307 (42.07)     33,686.4307 (42.07)    0.0000 (1.0)       33,686.4307 (42.07)    0.0000 (1.0)           0;0  0.0297 (0.02)          1           1
test_benchmark_cfg_generation[sql_select_select_nested_subquery.sql.test-False]          53,801.4904 (67.20)     53,801.4904 (67.20)     53,801.4904 (67.20)    0.0000 (1.0)       53,801.4904 (67.20)    0.0000 (1.0)           0;0  0.0186 (0.01)          1           1
test_benchmark_cfg_generation[sql_select_select_many_features.sql.test-True]             68,299.6835 (85.30)     68,299.6835 (85.30)     68,299.6835 (85.30)    0.0000 (1.0)       68,299.6835 (85.30)    0.0000 (1.0)           0;0  0.0146 (0.01)          1           1
test_benchmark_cfg_generation[sql_select_select_many_features.sql.test-False]           108,675.0438 (135.73)   108,675.0438 (135.73)   108,675.0438 (135.73)   0.0000 (1.0)      108,675.0438 (135.73)   0.0000 (1.0)           0;0  0.0092 (0.01)          1           1
test_benchmark_cfg_generation[lark_lark_self_grammar.lark.test-True]                    171,034.5827 (213.61)   171,034.5827 (213.61)   171,034.5827 (213.61)   0.0000 (1.0)      171,034.5827 (213.61)   0.0000 (1.0)           0;0  0.0058 (0.00)          1           1
test_benchmark_cfg_generation[lark_lark_self_grammar.lark.test-False]                   179,879.0275 (224.66)   179,879.0275 (224.66)   179,879.0275 (224.66)   0.0000 (1.0)      179,879.0275 (224.66)   0.0000 (1.0)           0;0  0.0056 (0.00)          1           1
test_benchmark_cfg_generation[json_outlines.generate.samplers.mypy.json.test-True]      655,138.7894 (818.23)   655,138.7894 (818.23)   655,138.7894 (818.23)   0.0000 (1.0)      655,138.7894 (818.23)   0.0000 (1.0)           0;0  0.0015 (0.00)          1           1
test_benchmark_cfg_generation[json_outlines.generate.samplers.mypy.json.test-False]     779,487.4373 (973.54)   779,487.4373 (973.54)   779,487.4373 (973.54)   0.0000 (1.0)      779,487.4373 (973.54)   0.0000 (1.0)           0;0  0.0013 (0.00)          1           1
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
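
The "Tokens / Second" figures follow directly from the token counts and the timings in the table. The short sketch below recomputes them for the two `select_simple` rows; the helper name `tokens_per_second` is made up for this example, and the inputs are copied from the output above.

```python
def tokens_per_second(num_tokens, time_ms):
    """Throughput from a token count and a wall-clock time in milliseconds."""
    return num_tokens / (time_ms / 1000.0)


# sql_select_select_simple.sql.test: 17 tokens in both runs.
simple_true = tokens_per_second(17, 800.6749)
simple_false = tokens_per_second(17, 5836.3914)

# Relative time of the False run against the fastest (True) run,
# which is what the parenthesised factor in the table expresses.
relative_time = 5836.3914 / 800.6749

print(f"True:  {simple_true:.3f} tokens/s")
print(f"False: {simple_false:.3f} tokens/s")
print(f"False / True time: {relative_time:.2f}x")
```

The same calculation applies to every other row, so the per-test throughput and the relative factors can be cross-checked from the table alone.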

Profile

$ pytest tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[json_simple_fruit.json-False] --benchmark-cprofile=cumtime

<removed noise>

--------------------------------------------------------------------------------- cProfile (time in s) ---------------------------------------------------------------------------------
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[json_simple_fruit.json-False]
ncalls	tottime	percall	cumtime	percall	filename:lineno(function)
1	0.0000	0.0000	14.3733	14.3733	outlines/tests/benchmark/test_benchmark_cfg_generation.py:130(<lambda>)
1	0.1738	0.1738	14.3732	14.3732	outlines/tests/benchmark/test_benchmark_cfg_generation.py:51(run_until_eos)
327	0.1854	0.0006	14.1685	0.0433	outlines/.myenv/lib/python3.11/site-packages/outlines/fsm/fsm.py:222(allowed_token_ids)
208	0.7643	0.0037	13.7202	0.0660	outlines/.myenv/lib/python3.11/site-packages/outlines/fsm/fsm.py:94(__init__)
208	0.0036	0.0000	7.3470	0.0353	outlines/.myenv/lib/python3.11/site-packages/outlines/caching.py:64(wrapper)
989/860	5.6061	0.0065	5.6063	0.0065	~:0(<built-in method builtins.sorted>)
208	0.2597	0.0012	3.2571	0.0157	outlines/.myenv/lib/python3.11/site-packages/outlines/caching.py:39(hash_arguments)
8	0.0003	0.0000	3.1981	0.3998	outlines/.myenv/lib/python3.11/site-packages/outlines/fsm/fsm.py:95(create_states_mapping)
8	0.0158	0.0020	3.0903	0.3863	outlines/.myenv/lib/python3.11/site-packages/outlines/fsm/regex.py:558(create_fsm_index_tokenizer)
8	0.3168	0.0396	2.9276	0.3660	outlines/.myenv/lib/python3.11/site-packages/outlines/fsm/regex.py:494(create_fsm_index_end_to_end)
416	0.0037	0.0000	2.7863	0.0067	outlines/.myenv/lib/python3.11/site-packages/cloudpickle/cloudpickle.py:1464(dumps)
416	0.0007	0.0000	2.7790	0.0067	outlines/.myenv/lib/python3.11/site-packages/cloudpickle/cloudpickle.py:1243(dump)
416	2.7783	0.0067	2.7783	0.0067	~:0(<function Pickler.dump at 0x7f49efb03ba0>)
84	2.4870	0.0296	2.4873	0.0296	outlines/.myenv/lib/python3.11/site-packages/outlines/fsm/regex.py:461(state_scan_tokens)
200	0.0008	0.0000	0.8564	0.0043	outlines/.myenv/lib/python3.11/site-packages/diskcache/core.py:1224(__getitem__)
200	0.0018	0.0000	0.8556	0.0043	outlines/.myenv/lib/python3.11/site-packages/diskcache/core.py:1123(get)
200	0.0018	0.0000	0.8514	0.0043	outlines/.myenv/lib/python3.11/site-packages/diskcache/core.py:254(fetch)
200	0.8406	0.0042	0.8406	0.0042	~:0(<built-in method _pickle.load>)
416	0.2078	0.0005	0.2078	0.0005	~:0(<method 'update' of '_hashlib.HASH' objects>)
8	0.0001	0.0000	0.1214	0.0152	outlines/.myenv/lib/python3.11/site-packages/outlines/models/transformers.py:177(__hash__)
8	0.0000	0.0000	0.1213	0.0152	outlines/.myenv/lib/python3.11/site-packages/datasets/fingerprint.py:226(hash)
8	0.0000	0.0000	0.1203	0.0150	outlines/.myenv/lib/python3.11/site-packages/datasets/utils/_dill.py:105(dumps)
8	0.0001	0.0000	0.1203	0.0150	outlines/.myenv/lib/python3.11/site-packages/datasets/utils/_dill.py:100(dump)
8	0.0000	0.0000	0.1201	0.0150	outlines/.myenv/lib/python3.11/site-packages/dill/_dill.py:416(dump)
8	0.0001	0.0000	0.1200	0.0150	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/pickle.py:476(dump)
825	0.1125	0.0001	0.1125	0.0001	~:0(<method '__reduce_ex__' of 'object' objects>)
1050404	0.0843	0.0000	0.0843	0.0000	~:0(<method 'add' of 'set' objects>)
825601	0.0397	0.0000	0.0397	0.0000	~:0(<method 'setdefault' of 'dict' objects>)
534	0.0320	0.0001	0.0324	0.0001	outlines/.myenv/lib/python3.11/site-packages/outlines/fsm/fsm.py:128(allowed_token_ids)
17970/601	0.0269	0.0000	0.0740	0.0001	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/copy.py:128(deepcopy)
8	0.0237	0.0030	0.0237	0.0030	outlines/.myenv/lib/python3.11/site-packages/outlines/fsm/regex.py:584(<dictcomp>)
6204	0.0214	0.0000	0.0333	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/parsers/lalr_parser_state.py:67(feed_token)
5811	0.0188	0.0000	0.0707	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:590(next_token)
730	0.0162	0.0000	0.0463	0.0001	outlines/.myenv/lib/python3.11/site-packages/interegular/fsm.py:972(crawl)
8838	0.0125	0.0000	0.0125	0.0000	~:0(<method 'match' of '_regex.Pattern' objects>)
440	0.0109	0.0000	0.0109	0.0000	~:0(<method 'execute' of 'sqlite3.Connection' objects>)
8	0.0108	0.0013	0.0108	0.0013	~:0(<built-in method _pickle.dumps>)
15053	0.0106	0.0000	0.0142	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:213(_future_new)
11867	0.0081	0.0000	0.0090	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/fsm.py:350(follow)
76602	0.0052	0.0000	0.0052	0.0000	~:0(<method 'append' of 'list' objects>)
48965	0.0043	0.0000	0.0043	0.0000	~:0(<method 'get' of 'dict' objects>)
48586	0.0039	0.0000	0.0039	0.0000	~:0(<built-in method builtins.id>)
45129	0.0035	0.0000	0.0035	0.0000	~:0(<built-in method builtins.len>)
27369	0.0046	0.0000	0.0094	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/fsm.py:117(<genexpr>)
23214	0.0038	0.0000	0.0055	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/fsm.py:103(__getitem__)
19879	0.0037	0.0000	0.0048	0.0000	~:0(<built-in method builtins.isinstance>)
19423	0.0037	0.0000	0.0037	0.0000	~:0(<built-in method builtins.getattr>)
17970	0.0076	0.0000	0.0101	0.0000	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/copy.py:243(_keep_alive)
16402	0.0016	0.0000	0.0016	0.0000	~:0(<built-in method builtins.issubclass>)
15894	0.0037	0.0000	0.0037	0.0000	~:0(<built-in method __new__ of type object at 0x7f4a8f1b1ba0>)
15053	0.0079	0.0000	0.0221	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:202(__new__)
8849	0.0051	0.0000	0.0164	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:262(__deepcopy__)
8799	0.0059	0.0000	0.0059	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/fsm.py:928(follow)
8784	0.0080	0.0000	0.0220	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:387(match)
8784	0.0070	0.0000	0.0095	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:292(feed)
8784	0.0055	0.0000	0.0313	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:587(match)
8784	0.0018	0.0000	0.0039	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:581(scanner)
8784	0.0016	0.0000	0.0016	0.0000	~:0(<method 'group' of '_regex.Match' objects>)
99	0.0002	0.0000	0.0010	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:116(_get_alphabet)
99	0.0002	0.0000	0.0002	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/fsm.py:133(<dictcomp>)
99	0.0001	0.0000	0.0006	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/fsm.py:131(from_groups)
99	0.0000	0.0000	0.0000	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:126(_get_lengths)
99	0.0000	0.0000	0.0000	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:168(simplify)
983	0.0002	0.0000	0.0002	0.0000	~:0(<method 'isupper' of 'str' objects>)
981	0.0012	0.0000	0.0013	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/fsm.py:191(__init__)
98/32	0.0001	0.0000	0.0004	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:361(_get_lengths)
95	0.0001	0.0000	0.0001	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:157(<dictcomp>)
936	0.0008	0.0000	0.0008	0.0000	~:0(<method 'write' of '_io.BytesIO' objects>)
92/8	0.0001	0.0000	0.0143	0.0018	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:517(group)
90	0.0000	0.0000	0.0000	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:70(_get_flags)
888	0.0003	0.0000	0.0004	0.0000	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/pickle.py:241(write)
88	0.0027	0.0000	0.0032	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:706(chargroup_inner)
88	0.0000	0.0000	0.0000	0.0000	~:0(<method 'end' of 're.Match' objects>)
86/8	0.0003	0.0000	0.0852	0.0106	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:447(to_fsm)
4	0.0015	0.0004	0.0042	0.0010	outlines/.myenv/lib/python3.11/site-packages/diskcache/core.py:230(_write)
1434/8	0.0019	0.0002	0.0147	0.0018	outlines/.myenv/lib/python3.11/site-packages/interegular/utils/simple_parser.py:34(w)
696/8	0.0016	0.0002	0.1196	0.0150	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/pickle.py:535(save)
144/8	0.0010	0.0001	0.0145	0.0018	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:503(conc)
8	0.0008	0.0001	0.0008	0.0001	~:0(<method 'update' of 'xxhash.xxh64' objects>)
355/8	0.0007	0.0001	0.0106	0.0013	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:69(get_alphabet)
696/8	0.0007	0.0001	0.1198	0.0150	outlines/.myenv/lib/python3.11/site-packages/datasets/utils/_dill.py:30(save)
12	0.0008	0.0001	0.0008	0.0001	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:172(_combine_char_groups)
696/8	0.0005	0.0001	0.1198	0.0150	outlines/.myenv/lib/python3.11/site-packages/dill/_dill.py:365(save)
339/16	0.0009	0.0001	0.0145	0.0009	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:512(obj)
8	0.0001	0.0000	0.1195	0.0149	outlines/.myenv/lib/python3.11/site-packages/datasets/utils/_dill.py:197(_save_transformersPreTrainedTokenizerBase)
24/8	0.0002	0.0000	0.1190	0.0149	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/pickle.py:621(save_reduce)
24/8	0.0001	0.0000	0.1182	0.0148	outlines/.myenv/lib/python3.11/site-packages/dill/_dill.py:1190(save_module_dict)
24/8	0.0001	0.0000	0.1181	0.0148	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/pickle.py:965(save_dict)
24/8	0.0000	0.0000	0.1180	0.0148	outlines/.myenv/lib/python3.11/site-packages/datasets/utils/_dill.py:71(_batch_setitems)
24/8	0.0002	0.0000	0.1180	0.0147	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/pickle.py:978(_batch_setitems)
3814	0.0018	0.0000	0.0018	0.0000	~:0(<method 'write' of '_io.BufferedWriter' objects>)
231	0.0002	0.0000	0.0002	0.0000	~:0(<method 'values' of 'dict' objects>)
8162	0.0006	0.0000	0.0006	0.0000	~:0(<method 'update' of 'set' objects>)
24	0.0000	0.0000	0.0000	0.0000	~:0(<method 'union' of 'set' objects>)
468	0.0006	0.0000	0.0011	0.0000	~:0(<method 'union' of 'frozenset' objects>)
21	0.0000	0.0000	0.0000	0.0000	~:0(<method 'translate' of 'str' objects>)
720	0.0001	0.0000	0.0001	0.0000	~:0(<method 'tell' of '_io.BytesIO' objects>)
1825	0.0010	0.0000	0.0010	0.0000	~:0(<method 'startswith' of 'str' objects>)
64	0.0000	0.0000	0.0000	0.0000	~:0(<method 'split' of 'str' objects>)
12	0.0000	0.0000	0.0000	0.0000	~:0(<method 'rstrip' of 'str' objects>)
24	0.0000	0.0000	0.0000	0.0000	~:0(<method 'rpartition' of 'str' objects>)
1981	0.0005	0.0000	0.0005	0.0000	~:0(<method 'rindex' of 'str' objects>)
12	0.0000	0.0000	0.0000	0.0000	~:0(<method 'rfind' of 'str' objects>)
3260	0.0003	0.0000	0.0003	0.0000	~:0(<method 'replace' of 'str' objects>)
2	0.0000	0.0000	0.0000	0.0000	~:0(<method 'remove' of 'set' objects>)
84	0.0000	0.0000	0.0000	0.0000	~:0(<method 'pop' of 'set' objects>)
62	0.0000	0.0000	0.0000	0.0000	~:0(<method 'pop' of 'list' objects>)
350	0.0001	0.0000	0.0001	0.0000	~:0(<method 'pop' of 'dict' objects>)
120	0.0001	0.0000	0.0001	0.0000	~:0(<method 'match' of 're.Pattern' objects>)
1750	0.0003	0.0000	0.0003	0.0000	~:0(<method 'keys' of 'dict' objects>)
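
The listing above came from pytest-benchmark's `--benchmark-cprofile=cumtime` option. For anyone who wants a similar cumulative-time view outside pytest, the following is a minimal sketch using only the standard library's `cProfile` and `pstats`. The helper `profile_by_cumtime` and the `workload` function are hypothetical stand-ins, not part of this PR.

```python
import cProfile
import io
import pstats


def profile_by_cumtime(func, *args, top=10, **kwargs):
    """Run func under cProfile and return (result, report sorted by cumtime)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args, **kwargs)
    finally:
        profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # "cumulative" orders rows by the cumtime column, as in the listing above.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


def workload():
    """Stand-in workload; replace with the code path under investigation."""
    return sorted(range(100_000, 0, -1))[:3]


result, report = profile_by_cumtime(workload)
print(report)
```

Sorting by cumulative time surfaces the call chain that dominates the run, while the tottime column shows where the time is actually spent; in the listing above those are the `builtins.sorted` and `Pickler.dump` rows.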

@rlouf rlouf added enhancement structured generation Linked to structured generation grammar labels Jan 26, 2024
@lapp0 lapp0 mentioned this pull request Feb 11, 2024
@rlouf
Member

rlouf commented Mar 18, 2024

Closing due to inactivity. Feel free to re-open.

Labels
enhancement grammar structured generation Linked to structured generation
3 participants