Support control characters in @CsvSource and @CsvFileSource #3824
Comments
Hi @xazap, thanks for raising the issue. I edited your description to clarify quoting vs. escaping. In addition, I confirmed the behavior you have reported. This may be an issue with the CSV parsing library that we use. In any case, we'll investigate what our options are.
That's indeed the case. It looks like the Univocity CSV parser ignores control characters by default. When I add the following to our
settings.setSkipBitsAsWhitespace(false);
However, the latter two invocations of I assumed the
OK, after a bit more experimentation, I got your
settings.getFormat().setCharToEscapeQuoteEscaping('\\');
settings.setSkipBitsAsWhitespace(false);
Although these changes in the settings do not cause any of the tests in our test suite to fail, I'm a bit hesitant to change them for all users. We may wish to introduce attributes in In light of that, we'll discuss this topic during one of our upcoming team calls.
Please note that control characters are ignored in your Thus, the following passes without any modifications to JUnit Jupiter, since
@ParameterizedTest
@CsvSource(delimiterString = "||", textBlock = """
A || 1
C\u0000D || 3
""")
void testWithUnquotedInput(String testcase, Integer expectedLength) {
assertNotNull(testcase);
assertEquals(expectedLength, testcase.length());
}
Similarly, the following also passes without any modifications to JUnit Jupiter by setting ignoreLeadingAndTrailingWhitespace = false:
@ParameterizedTest
@CsvSource(ignoreLeadingAndTrailingWhitespace = false, textBlock = """
A,1
\u0000,1
B\u0000,2
""")
void testWithUnquotedInput(String testcase, Integer expectedLength) {
assertNotNull(testcase);
assertEquals(expectedLength, testcase.length());
}
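To illustrate why a control character at the edge of a value can vanish while one in the middle survives, here is a minimal plain-Java sketch. It is only an analogy: `String.trim()` stands in for a parser that treats every character up to and including U+0020 as whitespace, and the class and method names are hypothetical. It does not reproduce the Univocity parser's or JUnit Jupiter's actual code.

```java
class ControlCharTrimDemo {

    // Removes leading and trailing characters whose code point is <= U+0020.
    // String.trim() is used here only as a stand-in for a parser that treats
    // control characters as whitespace; it is not the parser's own code.
    static String trimControlAsWhitespace(String value) {
        return value.trim();
    }

    // Removes only characters for which Character.isWhitespace() is true.
    // U+0000 is not whitespace by that definition, so it survives.
    static String stripUnicodeWhitespace(String value) {
        return value.strip();
    }

    public static void main(String[] args) {
        // A lone NUL is trimmed away entirely, leaving an empty value.
        System.out.println(trimControlAsWhitespace("\u0000").length());
        // A trailing NUL is dropped, so only "B" remains.
        System.out.println(trimControlAsWhitespace("B\u0000").length());
        // A NUL between two letters is not at an edge, so it is kept.
        System.out.println(trimControlAsWhitespace("C\u0000D").length());
        // strip() keeps the trailing NUL because it is not Unicode whitespace.
        System.out.println(stripUnicodeWhitespace("B\u0000").length());
    }
}
```

Under this analogy, a value consisting solely of a control character ends up empty once edge trimming is applied, which is consistent with the two tests above: the value with an interior control character passes with default settings, while the values with a leading or trailing control character need whitespace trimming disabled.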
Thank you for explaining! For me it's confusing if
Ah, this works around the issue, but makes test less readable because I have to deliberately insert characters I don't want to test for.
This works! The formatting is not as I would like, but it is an acceptable workaround. I am confused about the meaning of Also, I noticed something odd: if I move the closing
@ParameterizedTest
@CsvSource(ignoreLeadingAndTrailingWhitespace = false, textBlock = """
A,1
\u0000,1
B\u0000,2
""")
void testWithUnquotedInput(String testcase, Integer expectedLength) {
assertNotNull(testcase);
assertEquals(expectedLength, testcase.length());
}
Why would one less trailing whitespace character matter in this case?
You're welcome!
I understand how that can be confusing. To be honest, I was not aware of the difference with the Univocity parser's default behavior, and I doubt anyone else on the JUnit team was aware of that either. If I understood the documentation correctly, the difference is due to the fact that some databases include control characters in their exported CSV files which are typically ignored when importing or working with those CSV files.
As I mentioned above, we were unaware of that difference.
Yes, we can definitely update the Javadoc to make that explicit. However, I'd first like to discuss these topics within the team before committing to anything concrete.
I was not suggesting that you use that as a workaround. Rather, I was merely pointing out how things work with the default CSV parser settings.
I'm glad to hear that's a suitable workaround for you. 👍
There is no whitespace before
If you move the closing This is simply how text blocks in Java work. The documentation in the User Guide for
And:
I suggest you read that link which points to the Programmer's Guide to Text Blocks. Hopefully that clarifies things!
Cheers, there is a lot more to text blocks than I knew! It makes perfect sense now.
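For readers who have not yet run into this, the following plain-Java sketch shows how the position of the closing delimiter changes the content of a text block. It illustrates general Java text block behavior only; the class and method names are hypothetical, and it does not reconstruct the exact variant discussed in this thread.

```java
class TextBlockDelimiterDemo {

    // Closing delimiter on its own line, aligned with the content:
    // the content ends with a line terminator and has no leading spaces.
    static String closingOnOwnLine() {
        return """
            A,1
            B,2
            """;
    }

    // Closing delimiter directly after the last character:
    // the content has no trailing line terminator.
    static String closingOnLastLine() {
        return """
            A,1
            B,2""";
    }

    // Closing delimiter indented four spaces less than the content:
    // those four spaces are kept as leading whitespace on every line.
    static String closingOutdented() {
        return """
            A,1
            B,2
        """;
    }

    public static void main(String[] args) {
        System.out.println(closingOnOwnLine().equals("A,1\nB,2\n"));
        System.out.println(closingOnLastLine().equals("A,1\nB,2"));
        System.out.println(closingOutdented().equals("    A,1\n    B,2\n"));
    }
}
```

When leading and trailing whitespace is not ignored, any indentation retained this way would presumably become part of each parsed value, which is why the placement of the closing delimiter can change a test's outcome.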
Description
I am writing unit tests where test cases have input strings with (non-printable) control characters. These characters generally occupy code points U+0000 through U+001F. When using @CsvSource I am finding that using control character literals in strings behaves differently from printable characters.
For example:
\u0000 literal is translated to null.
\u0000 literal is translated to an empty string "".
This behavior is observed with both Eclipse's internal JUnit 5 test runner and with Maven's Surefire plugin. I have considered the possible impact of the nullValues parameter of @CsvSource. This attribute defaults to {}, so translation to null or an empty string is therefore not expected.
Steps to reproduce
The test below should pass, but unexpectedly fails for @CsvSource test cases that have \u0000 literals.
Context
Used versions
Build Tool/IDE