Commit 0e03e0e

chore: Restructure the integration tests (#646)
### Description

- Split integration tests based on scope:
  - Actor tests
  - Apify API tests
- Update `test_actor_with_crawler_reboot` to match crawlee `1.0.4`

### Issues

- Closes: #632
- Closes: #650
1 parent 12fe18f commit 0e03e0e

30 files changed (+716, -711 lines changed)

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -36,7 +36,7 @@ keywords = [
 dependencies = [
     "apify-client>=2.2.0,<3.0.0",
     "apify-shared>=2.0.0,<3.0.0",
-    "crawlee>=1.0.2,<2.0.0",
+    "crawlee>=1.0.4,<2.0.0",
     "cachetools>=5.5.0",
     "cryptography>=42.0.0",
     "impit>=0.6.1",

tests/integration/README.md

Lines changed: 20 additions & 9 deletions
@@ -1,14 +1,25 @@
 # Integration tests
 
-We have integration tests which build and run Actors using the Python SDK on the Apify Platform. To run these tests, you need to set the `APIFY_TEST_USER_API_TOKEN` environment variable to the API token of the Apify user you want to use for the tests, and then start them with `make integration-tests`.
+There are two different groups of integration tests in this repository:
+- Apify API integration tests. These test that the Apify SDK communicates correctly with the Apify API through the Apify client.
+- Actor integration tests. These test that the Apify SDK can be used in Actors deployed to the Apify platform. These are very high-level tests, and they test communication with the API and correct interaction with the Apify platform.
 
-If you want to run the integration tests on a different environment than the main Apify Platform, you need to set the `APIFY_INTEGRATION_TESTS_API_URL` environment variable to the right URL to the Apify API you want to use.
+To run these tests, you need to set the `APIFY_TEST_USER_API_TOKEN` environment variable to the API token of the Apify user you want to use for the tests, and then start them with `make integration-tests`.
 
-## How to write tests
+## Apify API integration tests
+These tests make real requests to the Apify API, unlike the unit tests, which mock such API calls. On the other hand, they are faster than `Actor integration tests` because they do not require building and deploying an Actor. These tests can also be fully debugged locally. Prefer writing integration tests at this level where possible.
+
+
+## Actor integration tests
+We have integration tests which build and run Actors using the Python SDK on the Apify platform. These integration tests are slower than `Apify API integration tests` as they need to build and deploy Actors on the platform. Preferably try to write `Apify API integration tests` first, and only write `Actor integration tests` when you need to test something that can only be tested on the platform.
+
+If you want to run the integration tests on a different environment than the main Apify platform, you need to set the `APIFY_INTEGRATION_TESTS_API_URL` environment variable to the right URL to the Apify API you want to use.
+
+### How to write tests
 
 There are two fixtures which you can use to write tests:
 
-### `apify_client_async`
+#### `apify_client_async`
 
 This fixture just gives you an instance of `ApifyClientAsync` configured with the right token and API URL, so you don't have to do that yourself.
 
@@ -17,15 +28,15 @@ async def test_something(apify_client_async: ApifyClientAsync) -> None:
     assert await apify_client_async.user('me').get() is not None
 ```
 
-### `make_actor`
+#### `make_actor`
 
-This fixture returns a factory function for creating Actors on the Apify Platform.
+This fixture returns a factory function for creating Actors on the Apify platform.
 
 For the Actor source, the fixture takes the files from `tests/integration/actor_source_base`, builds the Apify SDK wheel from the current codebase, and adds the Actor source you passed to the fixture as an argument. You have to pass exactly one of the `main_func`, `main_py` and `source_files` arguments.
 
 The created Actor will be uploaded to the platform, built there, and after the test finishes, it will be automatically deleted. If the Actor build fails, it will not be deleted, so that you can check why the build failed.
 
-### Creating test Actor straight from a Python function
+#### Creating test Actor straight from a Python function
 
 You can create Actors straight from a Python function. This is great because you can have the test Actor source code checked with the linter.
 
@@ -66,7 +77,7 @@ async def test_something(
     assert run_result.status == 'SUCCEEDED'
 ```
 
-### Creating Actor from source files
+#### Creating Actor from source files
 
 You can also pass the source files directly if you need something more complex (e.g. pass some fixed value to the Actor source code or use multiple source files).
 
@@ -127,7 +138,7 @@ async def test_something(
     assert actor_run.status == 'SUCCEEDED'
 ```
 
-### Asserts
+#### Asserts
 
 Since test Actors are not executed as standard pytest tests, we don't get introspection of assertion expressions. In case of failure, only a bare `AssertionError` is shown, without the left and right values. This means, we must include explicit assertion messages to aid potential debugging.
 
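Review note on the Asserts section: the effect is easy to reproduce in plain Python, where no assertion rewriting takes place. The sketch below is only an illustration; `check_bare`, `check_with_message` and `failure_text` are hypothetical helper names and are not part of the SDK or of this test suite.

```python
def check_bare(actual: int, expected: int) -> None:
    # No message: outside pytest's assertion rewriting, the compared values are lost.
    assert actual == expected


def check_with_message(actual: int, expected: int) -> None:
    # The explicit message is the only place where both values are recorded.
    assert actual == expected, f'Unexpected value: actual={actual!r}, expected={expected!r}'


def failure_text(check, actual: int, expected: int) -> str:
    """Run a check and return the text of the AssertionError it raises, if any."""
    try:
        check(actual, expected)
    except AssertionError as exc:
        return str(exc)
    return 'no failure'
```

Calling `failure_text(check_with_message, 1, 2)` yields a message that names both values, which is the information a log from a failed Actor run would otherwise be missing.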
tests/integration/actor/actor_source_base/src/__init__.py

Whitespace-only changes.
Lines changed: 322 additions & 0 deletions
@@ -0,0 +1,322 @@
+from __future__ import annotations
+
+import base64
+import inspect
+import os
+import subprocess
+import sys
+import textwrap
+from pathlib import Path
+from typing import TYPE_CHECKING, Any, Protocol
+
+import pytest
+from filelock import FileLock
+
+from apify_client import ApifyClient, ApifyClientAsync
+from apify_shared.consts import ActorJobStatus, ActorSourceType
+
+from .._utils import generate_unique_resource_name
+from apify._models import ActorRun
+
+if TYPE_CHECKING:
+    from collections.abc import Awaitable, Callable, Coroutine, Iterator, Mapping
+    from decimal import Decimal
+
+    from apify_client.clients.resource_clients import ActorClientAsync
+
+_TOKEN_ENV_VAR = 'APIFY_TEST_USER_API_TOKEN'
+_API_URL_ENV_VAR = 'APIFY_INTEGRATION_TESTS_API_URL'
+_SDK_ROOT_PATH = Path(__file__).parent.parent.parent.parent.resolve()
+
+
+@pytest.fixture(scope='session')
+def sdk_wheel_path(tmp_path_factory: pytest.TempPathFactory, testrun_uid: str) -> Path:
+    """Build the package wheel if it hasn't been built yet, and return the path to the wheel."""
+    # Make sure the wheel is not being built concurrently across all the pytest-xdist runners,
+    # through locking the building process with a temp file.
+    with FileLock(tmp_path_factory.getbasetemp().parent / 'sdk_wheel_build.lock'):
+        # Make sure the wheel is built exactly once across all the pytest-xdist runners,
+        # through an indicator file saying that the wheel was already built.
+        was_wheel_built_this_test_run_file = tmp_path_factory.getbasetemp() / f'wheel_was_built_in_run_{testrun_uid}'
+        if not was_wheel_built_this_test_run_file.exists():
+            subprocess.run(
+                args='python -m build',
+                cwd=_SDK_ROOT_PATH,
+                shell=True,
+                check=True,
+                capture_output=True,
+            )
+            was_wheel_built_this_test_run_file.touch()
+
+        # Read the current package version, necessary for getting the right wheel filename.
+        pyproject_toml_file = (_SDK_ROOT_PATH / 'pyproject.toml').read_text(encoding='utf-8')
+        for line in pyproject_toml_file.splitlines():
+            if line.startswith('version = '):
+                delim = '"' if '"' in line else "'"
+                sdk_version = line.split(delim)[1]
+                break
+        else:
+            raise RuntimeError('Unable to find version string.')
+
+        wheel_path = _SDK_ROOT_PATH / 'dist' / f'apify-{sdk_version}-py3-none-any.whl'
+
+        # Just to be sure.
+        assert wheel_path.exists()
+
+        return wheel_path
+
+
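Review note: the version lookup in `sdk_wheel_path` can be exercised in isolation. The following is a minimal sketch that mirrors the `for`/`else` scan from the fixture; `read_version` and `wheel_file_name` are hypothetical names introduced here for illustration, and the sketch assumes, like the fixture, that `pyproject.toml` contains a line starting with `version = ` and that the wheel is tagged `py3-none-any`.

```python
def read_version(pyproject_text: str) -> str:
    """Return the value of the first line that starts with `version = `."""
    for line in pyproject_text.splitlines():
        if line.startswith('version = '):
            # Accept either quoting style, as the fixture does.
            delim = '"' if '"' in line else "'"
            return line.split(delim)[1]
    raise RuntimeError('Unable to find version string.')


def wheel_file_name(version: str) -> str:
    """Build the wheel file name the fixture expects in the `dist` directory."""
    return f'apify-{version}-py3-none-any.whl'
```

Note that a plain line scan picks up the first matching `version = ` line regardless of which TOML table it belongs to, so it relies on the project version appearing before any other key with that name.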
+@pytest.fixture(scope='session')
+def actor_base_source_files(sdk_wheel_path: Path) -> dict[str, str | bytes]:
+    """Create a dictionary of the base source files for a testing Actor.
+
+    It takes the files from `tests/integration/actor/actor_source_base`, builds the Apify SDK wheel from
+    the current codebase, and adds them all together in a dictionary.
+    """
+    source_files: dict[str, str | bytes] = {}
+
+    # First read the actor_source_base files
+    actor_source_base_path = _SDK_ROOT_PATH / 'tests/integration/actor/actor_source_base'
+
+    for path in actor_source_base_path.glob('**/*'):
+        if not path.is_file():
+            continue
+        relative_path = str(path.relative_to(actor_source_base_path))
+        try:
+            source_files[relative_path] = path.read_text(encoding='utf-8')
+        except ValueError:
+            source_files[relative_path] = path.read_bytes()
+
+    sdk_wheel_file_name = sdk_wheel_path.name
+    source_files[sdk_wheel_file_name] = sdk_wheel_path.read_bytes()
+
+    source_files['requirements.txt'] = str(source_files['requirements.txt']).replace(
+        'APIFY_SDK_WHEEL_PLACEHOLDER', f'./{sdk_wheel_file_name}'
+    )
+
+    current_major_minor_python_version = '.'.join([str(x) for x in sys.version_info[:2]])
+    integration_tests_python_version = (
+        os.getenv('INTEGRATION_TESTS_PYTHON_VERSION') or current_major_minor_python_version
+    )
+    source_files['Dockerfile'] = str(source_files['Dockerfile']).replace(
+        'BASE_IMAGE_VERSION_PLACEHOLDER', integration_tests_python_version
+    )
+
+    return source_files
+
+
+class MakeActorFunction(Protocol):
+    """A type for the `make_actor` fixture."""
+
+    def __call__(
+        self,
+        label: str,
+        *,
+        main_func: Callable | None = None,
+        main_py: str | None = None,
+        source_files: Mapping[str, str | bytes] | None = None,
+        additional_requirements: list[str] | None = None,
+    ) -> Awaitable[ActorClientAsync]:
+        """Create a temporary Actor from the given main function or source files.
+
+        The Actor will be uploaded to the Apify Platform, built there, and after the test finishes, it will
+        be automatically deleted.
+
+        You have to pass exactly one of the `main_func`, `main_py` and `source_files` arguments.
+
+        Args:
+            label: The label which will be a part of the generated Actor name.
+            main_func: The main function of the Actor.
+            main_py: The `src/main.py` file of the Actor.
+            source_files: A dictionary of the source files of the Actor.
+            additional_requirements: A list of additional requirements to be added to the `requirements.txt`.
+
+        Returns:
+            A resource client for the created Actor.
+        """
+
+
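Review note: the "exactly one of" rule stated in the docstring could also be expressed by counting the supplied arguments. This is only a sketch of that alternative; `require_exactly_one` is a hypothetical helper and is not what the fixture in this commit does (it uses two explicit `if` checks). Like the fixture, the sketch relies on truthiness, so an empty string or empty mapping counts as "not provided".

```python
def require_exactly_one(**options: object) -> str:
    """Return the name of the single truthy option, or raise TypeError."""
    provided = [name for name, value in options.items() if value]
    names = ', '.join(f'`{name}`' for name in options)
    if not provided:
        raise TypeError(f'One of {names} arguments must be specified')
    if len(provided) > 1:
        raise TypeError(f'Cannot specify more than one of {names} arguments')
    return provided[0]
```

The counting form keeps working unchanged if a fourth mutually exclusive argument is ever added, whereas the pairwise checks grow with every new argument.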
+@pytest.fixture(scope='session')
+def make_actor(
+    actor_base_source_files: dict[str, str | bytes],
+    apify_token: str,
+) -> Iterator[MakeActorFunction]:
+    """Fixture for creating temporary Actors for testing purposes.
+
+    This returns a function that creates a temporary Actor from the given main function or source files. The Actor
+    will be uploaded to the Apify Platform, built there, and after the test finishes, it will be automatically deleted.
+    """
+    actors_for_cleanup: list[str] = []
+
+    async def _make_actor(
+        label: str,
+        *,
+        main_func: Callable | None = None,
+        main_py: str | None = None,
+        source_files: Mapping[str, str | bytes] | None = None,
+        additional_requirements: list[str] | None = None,
+    ) -> ActorClientAsync:
+        if not (main_func or main_py or source_files):
+            raise TypeError('One of `main_func`, `main_py` or `source_files` arguments must be specified')
+
+        if (main_func and main_py) or (main_func and source_files) or (main_py and source_files):
+            raise TypeError('Cannot specify more than one of `main_func`, `main_py` and `source_files` arguments')
+
+        client = ApifyClientAsync(token=apify_token, api_url=os.getenv(_API_URL_ENV_VAR))
+        actor_name = generate_unique_resource_name(label)
+
+        # Get the source of main_func and convert it into a reasonable main_py file.
+        if main_func:
+            func_source = textwrap.dedent(inspect.getsource(main_func))
+            func_source = func_source.replace(f'def {main_func.__name__}(', 'def main(')
+            main_py = '\n'.join(  # noqa: FLY002
+                [
+                    'import asyncio',
+                    '',
+                    'from apify import Actor',
+                    '',
+                    '',
+                    '',
+                    func_source,
+                ]
+            )
+
+        if main_py:
+            source_files = {'src/main.py': main_py}
+
+        assert source_files is not None
+
+        # Copy the source files dict from the fixture so that we're not overwriting it, and merge the passed
+        # argument in it.
+        actor_source_files = actor_base_source_files.copy()
+        actor_source_files.update(source_files)
+
+        if additional_requirements:
+            # Get the current requirements.txt content (as a string).
+            req_content = actor_source_files.get('requirements.txt', '')
+            if isinstance(req_content, bytes):
+                req_content = req_content.decode('utf-8')
+            # Append the additional requirements, each on a new line.
+            additional_reqs = '\n'.join(additional_requirements)
+            req_content = req_content.strip() + '\n' + additional_reqs + '\n'
+            actor_source_files['requirements.txt'] = req_content
+
+        # Reformat the source files in a format that the Apify API understands.
+        source_files_for_api = []
+        for file_name, file_contents in actor_source_files.items():
+            if isinstance(file_contents, str):
+                file_format = 'TEXT'
+                if file_name.endswith('.py'):
+                    file_contents = textwrap.dedent(file_contents).lstrip()  # noqa: PLW2901
+            else:
+                file_format = 'BASE64'
+                file_contents = base64.b64encode(file_contents).decode('utf-8')  # noqa: PLW2901
+
+            source_files_for_api.append(
+                {
+                    'name': file_name,
+                    'format': file_format,
+                    'content': file_contents,
+                }
+            )
+
+        print(f'Creating Actor {actor_name}...')
+        created_actor = await client.actors().create(
+            name=actor_name,
+            default_run_build='latest',
+            default_run_memory_mbytes=256,
+            default_run_timeout_secs=600,
+            versions=[
+                {
+                    'versionNumber': '0.0',
+                    'buildTag': 'latest',
+                    'sourceType': ActorSourceType.SOURCE_FILES,
+                    'sourceFiles': source_files_for_api,
+                }
+            ],
+        )
+
+        actor_client = client.actor(created_actor['id'])
+
+        print(f'Building Actor {actor_name}...')
+        build_result = await actor_client.build(version_number='0.0')
+        build_client = client.build(build_result['id'])
+        build_client_result = await build_client.wait_for_finish(wait_secs=600)
+
+        assert build_client_result is not None
+        assert build_client_result['status'] == ActorJobStatus.SUCCEEDED
+
+        # We only mark the client for cleanup if the build succeeded, so that if something goes wrong here,
+        # you have a chance to check the error.
+        actors_for_cleanup.append(created_actor['id'])
+
+        return actor_client
+
+    yield _make_actor
+
+    # Delete all the generated Actors.
+    for actor_id in actors_for_cleanup:
+        actor_client = ApifyClient(token=apify_token, api_url=os.getenv(_API_URL_ENV_VAR)).actor(actor_id)
+
+        if (actor := actor_client.get()) is not None:
+            actor_client.update(
+                pricing_infos=[
+                    *actor.get('pricingInfos', []),
+                    {
+                        'pricingModel': 'FREE',
+                    },
+                ]
+            )
+
+            actor_client.delete()
+
+
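Review note: the step in `_make_actor` that turns the source-file mapping into the list sent to the API is pure data transformation and can be tried without a token or network access. The sketch below mirrors that loop under the assumption that the `name`/`format`/`content` keys and the `TEXT`/`BASE64` values are exactly as shown in this diff; `to_api_source_files` is a hypothetical name, not a function from the SDK or the client.

```python
import base64
import textwrap
from collections.abc import Mapping
from typing import Union


def to_api_source_files(files: Mapping[str, Union[str, bytes]]) -> list[dict[str, str]]:
    """Convert a file name -> contents mapping into a list of file descriptors."""
    result = []
    for name, contents in files.items():
        if isinstance(contents, str):
            file_format = 'TEXT'
            if name.endswith('.py'):
                # Python sources often come from indented triple-quoted strings in tests.
                contents = textwrap.dedent(contents).lstrip()
        else:
            # Binary files (for example the SDK wheel) are sent Base64-encoded.
            file_format = 'BASE64'
            contents = base64.b64encode(contents).decode('utf-8')
        result.append({'name': name, 'format': file_format, 'content': contents})
    return result
```

Keeping the conversion in a separate function like this would also make it possible to cover it with a plain unit test, independent of the Actor build.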
+class RunActorFunction(Protocol):
+    """A type for the `run_actor` fixture."""
+
+    def __call__(
+        self,
+        actor: ActorClientAsync,
+        *,
+        run_input: Any = None,
+        max_total_charge_usd: Decimal | None = None,
+    ) -> Coroutine[None, None, ActorRun]:
+        """Initiate an Actor run and wait for its completion.
+
+        Args:
+            actor: Actor async client, in testing context usually created by `make_actor` fixture.
+            run_input: Optional input for the Actor run.
+
+        Returns:
+            Actor run result.
+        """
+
+
+@pytest.fixture(scope='session')
+def run_actor(apify_client_async: ApifyClientAsync) -> RunActorFunction:
+    """Fixture for calling an Actor run and waiting for its completion.
+
+    This fixture returns a function that initiates an Actor run with optional run input, waits for its completion,
+    and retrieves the final result. It uses the `wait_for_finish` method with a timeout of 10 minutes.
+    """
+
+    async def _run_actor(
+        actor: ActorClientAsync,
+        *,
+        run_input: Any = None,
+        max_total_charge_usd: Decimal | None = None,
+    ) -> ActorRun:
+        call_result = await actor.call(
+            run_input=run_input,
+            max_total_charge_usd=max_total_charge_usd,
+        )
+
+        assert isinstance(call_result, dict), 'The result of ActorClientAsync.call() is not a dictionary.'
+        assert 'id' in call_result, 'The result of ActorClientAsync.call() does not contain an ID.'
+
+        run_client = apify_client_async.run(call_result['id'])
+        run_result = await run_client.wait_for_finish(wait_secs=600)
+
+        return ActorRun.model_validate(run_result)
+
+    return _run_actor
