Any known python bindings? #178

Open
ghost opened this issue Feb 13, 2019 · 4 comments


ghost commented Feb 13, 2019

Hi!
I was wondering if there are any known Python bindings or APIs for the validator? It would be a great help if there are any.

Thanks


DavidUnderdown commented Feb 13, 2019

I've got a very simplistic script (not currently public) that launches the Scala validator as a subprocess from Python 3, checks the return code and writes out the validator output. Inputs and outputs are mostly hard-coded at the moment, so it would need a little work to turn it into a proper module, which would be a nicer approach. I'm certainly not aware of any pure Python implementation or more sophisticated API being available, though.

The calling module was something I vaguely had in mind as a task for a kind of Python code club we've got running within the organisation at the moment, once we're a bit further on.
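As a rough illustration of the "launch as a subprocess, keep the return code, write out the output" approach described above, here is a minimal sketch. The function name `run_validator`, its parameters, and the argument order (CSV file, then schema file) are assumptions made for illustration; the thread does not say how the validator is actually invoked, so the launcher command is passed in by the caller.

```python
import subprocess
from pathlib import Path


def run_validator(command, csv_path, schema_path, report_path):
    """Run a validator command on one CSV file and save its output.

    `command` is the launcher as a list of strings. It is hypothetical:
    the executable name and any flags must be supplied by the caller.
    """
    completed = subprocess.run(
        [*command, str(csv_path), str(schema_path)],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold error output into the report
        text=True,                 # decode output to str (Python 3.7+)
        check=False,               # the caller inspects returncode itself
    )
    # Write whatever the validator printed to the report file.
    Path(report_path).write_text(completed.stdout, encoding="utf-8")
    return completed
```

The function returns the `subprocess.CompletedProcess` object, so the caller can still use `returncode`, `stdout` or `check_returncode()` afterwards.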


ghost commented Feb 13, 2019

Yeah, I use it in a similar way, as a subprocess. But using a subprocess feels like an awkward approach to me.


DavidUnderdown commented Feb 13, 2019

Probably the most important thing is understanding the return codes. I have

import subprocess

# csvvalidator is assumed to be the CompletedProcess returned by subprocess.run(...)
try:
    csvvalidator.check_returncode()
except subprocess.CalledProcessError:
    if csvvalidator.returncode == 3:
        # this is not unexpected (a data validation issue), flag to user and continue
        print(batch, "has CSV validation errors reported, see", reportFile, "for details")
    elif csvvalidator.returncode == 2:
        # something wrong with the schema file; all validation uses the same schema,
        # so stop the run, raising the error
        print("CSV schema parsing error:", csvvalidator.stdout)
        raise
    else:
        # any other non-zero return code is unexpected, so do not swallow it
        raise
except subprocess.SubprocessError:
    # some completely unexpected error has occurred with the subprocess
    raise
# write out the report file and, if no error, confirm pass or pass with warnings for the batch
if csvvalidator.returncode == 0:
    if csvvalidator.stdout == "PASS\n":
        print(batch, "passed CSV validation")
    else:
        print(batch, "passed CSV validation with warnings, see", reportFile, "for details")
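The return-code handling above could also be pulled into a small pure function, which is easier to test than code that prints and raises. This is only a sketch: the names `Outcome` and `classify` are made up for illustration, and the meanings of return codes 0, 2 and 3 and of the `"PASS\n"` output are taken from the snippet above rather than from the validator's documentation.

```python
from enum import Enum


class Outcome(Enum):
    PASSED = "passed CSV validation"
    PASSED_WITH_WARNINGS = "passed CSV validation with warnings"
    INVALID_DATA = "CSV validation errors reported"
    SCHEMA_ERROR = "CSV schema parsing error"
    UNKNOWN = "unexpected return code"


def classify(returncode, stdout):
    """Map a validator return code and its output to an Outcome."""
    if returncode == 0:
        # Exactly "PASS\n" means a clean pass; anything else means warnings.
        return Outcome.PASSED if stdout == "PASS\n" else Outcome.PASSED_WITH_WARNINGS
    if returncode == 3:
        return Outcome.INVALID_DATA
    if returncode == 2:
        return Outcome.SCHEMA_ERROR
    # Codes not covered in this thread are treated as unknown.
    return Outcome.UNKNOWN
```

A caller could then decide per outcome whether to continue with the next batch (for example on `INVALID_DATA`) or to stop the run (for example on `SCHEMA_ERROR`).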


afranke commented Feb 18, 2019

Thanks @DavidUnderdown, I’m on the subprocess train as well but my code was not as sophisticated as yours. A native implementation would indeed be nice, not only because of the more idiomatic use or for performance reasons, but also because having more than one implementation around is always a good thing for something that tries to establish itself as a standard.
