@@ -10,14 +10,42 @@ parent: Principles
1010
1111# Testing recommendations
1212
13- In the guide, we will classify two kingdoms of test: external and internal.
14- External tests view the module from the perspective of a user of the module, and
15- are concerned that the public-facing features behave as expected. Internal tests
16- view the module from the perspective of code inside of the module, and ensure
17- that the components that make up our package work as expected, and interact with
18- each other properly.
19-
20- ### Any test case is better than none
13+ In this guide, we will provide a roadmap and best practices for creating test
14+ suites for Python projects.
15+
16+ We will describe the most important types of test suites, the purposes they
17+ serve, and the differences between them. They are presented in outside-in
18+ order, which is our recommended approach. We start with
19+ [Public Interface tests](#user-interface-and-public-api-testing), which test your
20+ code from the perspective of your users, focusing on the behavior of the public
21+ interface and the features that your project provides. Then we cover
22+ [Package Level Integration tests](#project-level-integration-testing), which test
23+ that the various parts of your package work together, and work with the other
24+ packages it depends on. Finally, we cover the venerable
25+ [Unit Tests](#unit-tests), which test the correctness of your code from a
26+ perspective internal to your codebase, exercise individual units in isolation, and
27+ are optimized to run quickly and often.
28+
29+ These three test suites will cover the bulk of your testing needs and help get your
30+ project to a reliable and maintainable state. We will also discuss some more
31+ specialized and advanced types of test cases in our
32+ [Taxonomy of Test Cases](#taxonomy-of-test-cases) section.
33+
34+ ## Advantages of Testing
35+
36+ - Trustworthy code: Well-tested code is code that you can trust to behave as
37+ expected.
38+ - Living Documentation: A good test is a form of documentation, which tells us
39+ how the code is expected to behave, communicates the intent of the author, and
40+ is validated every time the test is run.
41+ - Preventing Failure: Tests provide safety against many ways code can fail, from
42+ errors in implementation, to unexpected changes in upstream dependencies.
43+ - Confidence when making changes: A thorough suite of tests allows developers to
44+ add features, fix bugs, and refactor code, with a degree of confidence that
45+ their changes do not break existing features, or cause unexpected
46+ side-effects.
47+
48+ ## Any test case is better than none
2149
2250When in doubt, write the test that makes sense at the time.
2351
@@ -31,7 +59,7 @@ bogged down in the taxonomy of test types. As you write and use your test suite,
3159the reason for classifying and sorting some types of tests into different test
3260suites will become apparent.
3361
34- ### As long as that test is correct...
62+ ## As long as that test is correct...
3563
3664It can be surprisingly easy to write a test that passes when it should fail,
3765especially when using complicated mocks and fixtures. The best way to avoid this
@@ -45,14 +73,20 @@ the test-case to make sure it fails when the code is broken.
4573 is better to write many test cases for a single function or class, than one
4674 giant case.
4775
48- ## External or outside-in testing
76+ ## User Interface and Public API testing
4977
5078A good place to start writing tests is from the perspective of a user of your
5179module or library, as described in the [ Test
5280Tutorial] ({% link pages/tutorials/test.md %}), and [ Testing with pytest
53- guide] ({% link pages/guides/pytest.md %}). These test cases live outside your
54- code, and include many styles or types of test that you may have heard of
55- (behavioral, fuzz, end-to-end, feature, etc., etc.).
81+ guide] ({% link pages/guides/pytest.md %}).
82+
83+ - These test cases live outside of your source code.
84+ - Test the code as you expect your users to interact with it.
85+ - Keep these tests simple and easily readable, so that they provide good
86+ documentation when a user asks "how should I use this feature?"
87+ - Focus on the supported use-case, and avoid extensive edge-case testing
88+ (edge-case and exhaustive input testing will be handled in a separate test
89+ suite).
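As a minimal sketch of the points above, assume a hypothetical package whose public API exposes a single `slugify` function. The name and its behavior are invented for illustration, and the function is defined inline only so the example is self-contained. The tests call it exactly as a user would, and cover only the supported use-case:

```python
import re


def slugify(title: str) -> str:
    """Hypothetical public function, standing in for your package's API."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


def test_slugify_makes_url_friendly_titles():
    # Reads like usage documentation: one supported use-case, no edge-cases.
    assert slugify("Testing Recommendations") == "testing-recommendations"


def test_slugify_drops_punctuation():
    assert slugify("Hello, World!") == "hello-world"
```

In a real project the tests would import `slugify` from the installed package rather than define it next to the tests.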
5690
5791{: .highlight-title }
5892
@@ -63,80 +97,35 @@ code, and include many styles or types of test that you may have heard of
6397> your test suite(s) grow, the taxonomy of test cases and the use/need for
6498> different kinds of tests will become more clear.
6599
66- ### Taxonomy of outside-in tests
67-
68- A non-exhaustive discussion of some common types of tests.
69-
70- ^_ ^ Dont Panic ^_ ^
71-
72- Depending on your project, you may not need many, or most of these kinds of
73- tests.
74-
75- - A library project probably does not need to test integration with
76- microservices.
77- - A library with no 3rd party dependencies, does not need test them.
78- - Fuzz testing is for critical code, that many users rely on.
79-
80- #### Behavioral, Feature, or Functional Tests:
81-
82- High-level tests, which ensure a specific feature works. Used for testing things
83- like:
84-
85- - Loading a file works
86- - Setting a debug flag results in debug messages being printed
87- - A configuration option affects the behavior of the code as expected
88-
89- #### Fuzz Tests
90-
91- Fuzz tests attempt to test the full range of possible inputs to a function. They
92- are good for finding edge-cases, where what should be valid input causes a
93- failure. [ Hypothesis] ( https://hypothesis.readthedocs.io/en/latest/ ) is an
94- excellent tool for this, and a lot of fun to use.
95-
96- - SLOW TESTS: fuzz tests can take a very long time to run, and should usually be
97- placed in a test suite which is run separately from faster tests.
98- [ see: fail fast] ( https://en.wikipedia.org/wiki/Fail-fast_system )
99- - Reserve fuzz testing for the few critical functions, where it really matters.
100-
101- #### Integration Tests
102-
103- The word "Integration" is a bit overloaded, and can refer to many levels of
104- interaction between your code, its dependencies, and external systems.
105-
106- - Code level
107- - Test the integration between your software and external / 3rd party
108- dependencies.
109- - Low-level testing of your code-base, where you run the code imported from
110- dependencies without mocking it.
100+ ## Project Level Integration Testing
111101
112- - Environment level
113- - Testing that your software works in the environments you plan to run it in.
114- - Running inside of a docker container
115- - Using GPU's or other specialized hardware
116- - Deploying it to cloud servers
117-
118- - System level
119- - Testing that it interacts with other software in a larger system.
120- - Interactions with other services, on local or cloud-based platforms
121- - Micro-service, Database, or API connections and interactions
102+ The term "Integration Test" is unfortunately overloaded, and used to describe
103+ testing that various components integrate with each other, at many levels of the
104+ system. These tests will loosely follow the "Detroit School" of test design.
122105
123- #### End to End Tests
106+ - Write tests which view the code from an outside-in perspective, like
107+ [ Public Interface] ( ) tests
108+ - Avoid Mocks/Fakes/Patches as much as possible
109+ - Test that the components of your code all work together (inner-package
110+ integration)
111+ - Test that your code works with its dependencies (dependency integration)
124112
125- The slowest, and most brittle, of all tests. Here, you set up an entire
126- production-like system, and run tests against it .
113+ These tests can be a good place for more extensive edge-case and fuzz input
114+ testing.
127115
128- - Create a Dev / Testing / Staging environment, and run tests against it to make
129- sure everything works together
130- - Fake user input, using tools like
131- [ Selenium] ( https://www.selenium.dev/documentation/ )
132- - Processing data from a pre-loaded test database
133- - Manual QA testing
116+ The intended audience for these tests is developers working on the project, or
117+ debugging issues they encounter, as opposed to Public Interface tests, which
118+ should be helpful for users of the package.
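A rough sketch of this style, assuming two hypothetical components (`save_settings` and `load_settings`, both invented for this example) that must agree on a file format. The test uses the real filesystem and the real `json` module, with no mocks or patches. Here `json` is from the standard library; in a real project it would typically be a third-party dependency:

```python
import json
import tempfile
from pathlib import Path


def save_settings(path: Path, settings: dict) -> None:
    """Hypothetical component A: writes settings to disk."""
    path.write_text(json.dumps(settings, sort_keys=True), encoding="utf-8")


def load_settings(path: Path) -> dict:
    """Hypothetical component B: reads settings back, applying defaults."""
    defaults = {"debug": False}
    return {**defaults, **json.loads(path.read_text(encoding="utf-8"))}


def test_settings_round_trip():
    # No mocks or patches: real files and the real json module are used.
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "settings.json"
        save_settings(path, {"name": "demo"})
        assert load_settings(path) == {"debug": False, "name": "demo"}
```

When running under pytest, the built-in `tmp_path` fixture can replace the `tempfile.TemporaryDirectory` context manager.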
134119
135120## Unit Tests
136121
137- Internal tests, which test that individual units/components of the code behave
138- as expected in isolation. Some examples of units are: A single function, an
139- attribute of an object, a method or property of a class.
122+ Unit tests loosely follow the "London School" of testing, where the smallest
123+ unit of code is tested in isolation.
124+
125+ These tests are written from an internal perspective, so they are a good place
126+ to test aspects of the codebase which are "private", that is, not directly exposed to
127+ users, but which still need to be tested. Some examples of units are: a single
128+ function, an attribute of an object, a method or property of a class.
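One possible illustration, assuming a hypothetical `Greeter` class (not part of any real package) whose collaborator is injected. The first test exercises a "private" helper directly; the second isolates the unit by replacing the collaborator with a `unittest.mock.Mock`:

```python
from unittest.mock import Mock


class Greeter:
    """Hypothetical class whose collaborator (a mailer) is injected."""

    def __init__(self, mailer):
        self._mailer = mailer

    def _format(self, name: str) -> str:
        # "Private" helper: not part of the public API, but still worth testing.
        return f"Hello, {name.strip().title()}!"

    def greet(self, address: str, name: str) -> None:
        self._mailer.send(address, self._format(name))


def test_format_normalizes_name():
    greeter = Greeter(mailer=Mock())
    assert greeter._format("  ada lovelace ") == "Hello, Ada Lovelace!"


def test_greet_sends_formatted_message():
    # The mailer is a Mock, so no real message is sent.
    mailer = Mock()
    Greeter(mailer).greet("ada@example.com", "ada")
    mailer.send.assert_called_once_with("ada@example.com", "Hello, Ada!")
```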
140129
141130### Advantages of unit testing:
142131
@@ -362,12 +351,128 @@ def test_pytest(mocker):
362351 dangerous_sideffects()
363352```
364353
354+ ### A Brief Taxonomy of Test Suites
355+
356+ A non-exhaustive discussion of some common types of tests.
357+
359+ ^_^ Don't Panic ^_^
359+
360+ Depending on your project, you may not need many, or most of these kinds of
361+ tests.
362+
363+ - A library project probably does not need to test integration with
364+ microservices.
365+ - A library with no 3rd party dependencies does not need to test them.
366+ - Fuzz testing is for critical code, that many users rely on.
367+
368+ #### Behavioral, Feature, or Functional Tests:
369+
370+ High-level tests, which ensure a specific feature works. Used for testing things
371+ like:
372+
373+ - Loading a file works
374+ - Setting a debug flag results in debug messages being printed
375+ - A configuration option affects the behavior of the code as expected
376+
377+ #### Fuzz Tests
378+
379+ Fuzz tests attempt to test the full range of possible inputs to a function. They
380+ are good for finding edge-cases, where what should be valid input causes a
381+ failure. [ Hypothesis] ( https://hypothesis.readthedocs.io/en/latest/ ) is an
382+ excellent tool for this, and a lot of fun to use.
383+
384+ - SLOW TESTS: fuzz tests can take a very long time to run, and should usually be
385+ placed in a test suite which is run separately from faster tests.
386+ [ see: fail fast] ( https://en.wikipedia.org/wiki/Fail-fast_system )
387+ - Reserve fuzz testing for the few critical functions, where it really matters.
388+
389+ #### Integration Tests
390+
391+ The word "Integration" is a bit overloaded, and can refer to many levels of
392+ interaction between your code, its dependencies, and external systems.
393+
394+ - Code level
395+ - Test the integration between your software and external / 3rd party
396+ dependencies.
397+ - Low-level testing of your code-base, where you run the code imported from
398+ dependencies without mocking it.
399+
400+ - Environment level
401+ - Testing that your software works in the environments you plan to run it in.
402+ - Running inside of a docker container
403+ - Using GPUs or other specialized hardware
404+ - Deploying it to cloud servers
405+
406+ - System level
407+ - Testing that it interacts with other software in a larger system.
408+ - Interactions with other services, on local or cloud-based platforms
409+ - Micro-service, Database, or API connections and interactions
410+
411+ #### End to End Tests
412+
413+ The slowest, and most brittle, of all tests. Here, you set up an entire
414+ production-like system, and run tests against it.
415+
416+ - Create a Dev / Testing / Staging environment, and run tests against it to make
417+ sure everything works together
418+ - Fake user input, using tools like
419+ [ Selenium] ( https://www.selenium.dev/documentation/ )
420+ - Processing data from a pre-loaded test database
421+ - Manual QA testing
422+
423+ ### Other Kinds of Internal Tests
424+
425+ What distinguishes Internal tests is their perspective on the code.
426+ External tests focus on the way users will interact with the package (or
427+ the public API) and "avoid testing implementation details". Internal tests exist
428+ to test that those critical implementation details work correctly.
429+
430+ #### Testing Edge Cases
431+
432+ While writing unit tests, you may be tempted to test edge cases. You may have a
433+ critical private function or algorithm, which is not part of the public API, so
434+ not a good candidate for External testing, and you are concerned about many
435+ edge cases that you want to defend against using tests.
436+
437+ It is perfectly valid to write extensive edge-case testing for private code, but
438+ these tests should be kept separate from the unit test suite. Extensive edge-case
439+ testing makes tests long and difficult to read (tests are documentation). It
440+ can also slow down execution, and we want unit tests to run first, fast, and often.
441+
442+ - Place them in separate files from unit tests, to improve readability
443+ - [ mark them] ( https://docs.pytest.org/en/stable/how-to/mark.html ) so that they
445+ can be run as a separate test suite, after your unit tests pass
445+
446+ #### Fuzz Tests and other slow tests
447+
448+ Testing random input, using tools like Hypothesis, is similar to testing edge
449+ cases, but running these tests can take a very long time, and they can often be
450+ much more complex and difficult to read for new developers.
451+
452+ - Place them in their own test files
453+ - [ mark them] ( https://docs.pytest.org/en/stable/how-to/mark.html ) so that they
454+ can be run as a separate test suite, once all of the faster test suites have
455+ succeeded.
456+
365457## Diagnostic Tests
366458
367459Diagnostic tests are used to verify the installation of a package. They should
368460be runnable on production systems, like when we need to SSH into a live server to
369461troubleshoot problems.
370462
463+ A diagnostic test suite may contain any combination of tests you deem pertinent.
464+ You could include all the unit tests, or a specific subset of them. You may want
465+ to include some integration tests, and feature tests. Consider them Smoke Tests,
466+ a select sub-set of tests, meant to catch critical errors quickly, not perform a
467+ full system check of the package.
468+
469+ - Respect the user's environment!
470+ - Diagnostic tests should not require additional dependencies beyond what the
471+ package requires.
472+ - Do not create files, alter a database, or change the state of the system
474+ - Run quickly: select tests that can be run in a few moments
475+ - Provide meaningful feedback
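The guidelines above might be sketched as a small, read-only check that uses only the standard library. The function name `run_diagnostics`, the module names it checks, and the minimum Python version are all placeholders chosen for illustration, not part of any real package:

```python
import importlib
import sys


def run_diagnostics(required_modules=("json", "pathlib")):
    """Return a list of (check name, passed, message) tuples.

    Read-only: nothing is written to disk and no state is changed.
    The module names are placeholders for your package's real dependencies.
    """
    results = []
    version_ok = sys.version_info >= (3, 8)  # assumed minimum version
    results.append(("python-version", version_ok, sys.version.split()[0]))
    for name in required_modules:
        try:
            importlib.import_module(name)
            results.append((f"import {name}", True, "ok"))
        except ImportError as exc:
            # Report the failure instead of crashing, so every check runs.
            results.append((f"import {name}", False, str(exc)))
    return results


if __name__ == "__main__":
    for check, passed, message in run_diagnostics():
        print(f"{'PASS' if passed else 'FAIL'}  {check}: {message}")
```

Returning the results, rather than only printing them, lets the same function back both a command-line check and an automated smoke test.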
475+
371476### Advantages of Diagnostic Tests
372477
373478- Diagnostic tests allow us to verify an installation of a package.