Description and Purpose
Executing and testing learners' code, and providing feedback and guidance based on the result, is one of the Codecademy platform's strengths. Testing learner code is complicated, however, and often requires careful forethought and choosing the right test for the job. A test is a piece of code that runs to assess the learner's submission and provide helpful feedback if the answer is wrong. Tests are intended to provide instant feedback so that learners can adjust their approach to the problem with guidance.
Testing primarily takes place in checkpoints within exercises. Every checkpoint needs a test, even if it is just a default pass. Both code execution and tests are triggered by the Run or Check Work button, but they happen in separate processes.
Code Challenge Assessment (CCA)
CCAs are intended as practice content, so in a CCA, tests are used to verify that a learner can correctly implement a prompt. CCAs can only contain a single test file. Unlike lessons, learners can run code and execute tests separately. The intended behavior is for learners to read the prompt, try an implementation and test it themselves to verify, and then check their answer when they're confident that they have implemented it correctly.
In rare cases, test suites have been provided to learners in projects, both on-platform and off-platform. In these cases, learners manually run test suites that are usually intended to test output and final code behavior (more like true unit tests) rather than the often implementation-specific tests used in lessons. Tests in projects are intended to give learners a way to verify that their solution meets our intended behavior.
Tests can be broadly divided into two categories: those that run in a learner's browser (browser tests) and those that run in the learner's workspace container (container tests) in Codecademy's code evaluation infrastructure:
- Browser: These tests come in two varieties, Component tests and WebSCT tests. Component tests can assert something about one of the Learning Environment components, such as the code editor; frequently this is used to compare learner code against a regular expression matching the expected result. WebSCT tests have access to the rendered web page, so they can make assertions about the learner's output.
- Container: These tests happen in the back-end infrastructure of Codecademy. They are most often language-specific, and they usually use open-source unit-testing libraries and test runners (such as JUnit or Mocha) to test actual code behavior and output.
| Test Type | Browser or Container | Language |
| --- | --- | --- |
| Bats (Active) | Container | Any (mostly used for SQL output and bash commands) |
| Bats (Passive) | Container | Mostly Bash |
| Codex SCT | Container | Legacy, for Python 2 and Ruby |
| Default Pass (No Test) | Container | None |
| Execute File | Container | Any (but rarely used) |
| Execute Test | Container | Any (but rarely used) |
| Python Unit | Container | Python 3 |
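As a rough sketch of what a container-side Python Unit test might look like: the `LEARNER_CODE` string and the `num_cupcakes` name below are illustrative stand-ins, not part of Codecademy's actual test harness, which would run the learner's real file instead.

```python
import unittest

# Stand-in for the learner's submitted script; a real container test
# would import or execute the learner's actual file.
LEARNER_CODE = "num_cupcakes = 25"

class TestCheckpoint(unittest.TestCase):
    def setUp(self):
        # Run the learner's code in an isolated namespace
        self.ns = {}
        exec(LEARNER_CODE, self.ns)

    def test_num_cupcakes(self):
        # Check existence first so feedback is directional, not a crash
        self.assertIn("num_cupcakes", self.ns,
                      "First define the `num_cupcakes` variable.")
        self.assertEqual(self.ns["num_cupcakes"], 25,
                         f"Expected `num_cupcakes` to be 25 but found {self.ns['num_cupcakes']}.")
```

The failure messages double as the learner-facing feedback, which is why they are written as instructions rather than raw assertion output.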
- Potential runtime errors the learner's code may raise when being tested should be handled by the test and given human-readable responses. (E.g., when testing the values of variables, the test should first check that the appropriate variables are defined and, if not, return a helpful error message.)
- The test should first check whether the entire exercise (not just the current checkpoint) has been completed, and pass if it has. That way, if the learner finishes the whole exercise before running the code, they do not fail every checkpoint leading up to the last one.
- Test defensively. A common test-writing error is to make assumptions about the state of a learner's workspace when they submit code for testing. For example, if you want to test a variable's value in learner code, always test first for the existence of that identifier before making assertions about its value. Testing defensively provides more directional feedback to learners (e.g., "First define the x variable.") and helps avoid runtime test execution errors.
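One way to sketch that defensive ordering in a Python unit test: check existence, then type, then value, so each failure produces its own directional message. The `price` variable and its expected value are hypothetical.

```python
import unittest

LEARNER_CODE = "price = 4.99"  # hypothetical learner submission

class TestDefensively(unittest.TestCase):
    def setUp(self):
        self.ns = {}
        exec(LEARNER_CODE, self.ns)

    def test_price(self):
        # 1. Existence first, so the learner gets guidance instead of a crash
        self.assertIn("price", self.ns, "First define the `price` variable.")
        # 2. Then type, so the value comparison below can't raise a confusing error
        self.assertIsInstance(self.ns["price"], float,
                              "`price` should be a floating-point number.")
        # 3. Only then assert on the value itself
        self.assertEqual(self.ns["price"], 4.99,
                         f"Expected `price` to be 4.99 but found {self.ns['price']}.")
```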
For a checkpoint where a learner is defining a function, the test should cover both the normal operation of the function and reasonable edge cases.
Tests should pass valid code regardless of implementation, unless implementation is the thing being tested.
Debugging tests can expose some confusing Learning Environment behavior for an author. When running tests, make sure that:
- You have "Allow All Navigation" disabled
- You reset lesson progress to that exercise (code can be run, but tests will not execute if you have already completed an exercise)
- Test feedback tone and voice should be consistent across a lesson.
- Format error messages correctly. When writing error messages, use backticks to format expected/actual values and code references.
- Write self-documenting tests that can be maintained by future curriculum developers.
  - Ex.: Use descriptive variable names like `given_answer` so that someone maintaining the test can see why failing cases occur.
  - Ex.: Comment each unclear failing case or non-default passing case with why the test checks for it, e.g. "testing for the proper handling of negative numbers"
- Test feedback should guide the learner towards the correct answer (e.g. not simply, "this is wrong.")
- Focus on behavior instead of implementation. Whenever possible, it is best to test the behavior of code (the expected output of a function given a particular input, for example) rather than a restrictive implementation. There are myriad ways to implement a piece of functionality (think if/else branches vs. switch statements), and unless we are trying to teach a specific new concept, it's best to be flexible.
- If a value or output is being checked, strive to have the test feedback communicate both the actual output and the expected output. E.g. "Expected `num_cupcakes` to be 25 but found 12."
- Tests should anticipate common learner misconceptions and errors. E.g. if a learner is supposed to plot x vs. y, a common error is to plot y vs. x. The test should look for this specifically and produce an error message like "Inputs to the plot function are in the wrong order."
- Anticipate Learner Misunderstanding. When testing, try to anticipate common errors that a learner might encounter and provide appropriate, guiding feedback.
- Handle errors when running learner code. Whenever possible, don't assume that learner code will be free of syntax/runtime errors when using container tests. Try to gracefully handle errors and provide feedback. Chances are, if learners have an error, they'll see it in the output terminal, and you don't want to overwhelm them with multiple (potentially confusing and differently formatted) runtime/compiler errors.
- Avoid test feedback that provides learners passing code.
Here are several examples of good tests.
- Linear Regression Put It Together - checks for the final passing code, and then for the individual checkpoint.
- Lists Code Challenge - has several examples of multiple test cases.
- Matplotlib Basic Line Plot - checks for common mistake of wrong order of plotting
- Function Declarations - Mocha test with Structured.js (for testing syntax structure) and Rewire.js (for importing variables)
- componentDidMount - Mocha test with Enzyme.js (for testing React components) and Sinon (for mocking)
- How a Form Works - using WebSCT
- CSS Position Code Challenge - using WebSCT