Yet Another Cycling Forum

General Category => The Knowledge => Ctrl-Alt-Del => Topic started by: aidan.f on 12 March, 2018, 01:24:49 pm

Title: A little python help please
Post by: aidan.f on 12 March, 2018, 01:24:49 pm: I'm writing a simple program to interrogate a measuring instrument using PyVisa. If for example I send this command from a terminal console:
Quote
>>> MS2712E.write(':MMEMory:MSIS USB')
I get echoed back:
Quote
(32, <StatusCode.success: 0>)
What I cannot understand is how to capture this in a variable within my program code. I wish to wait for the response and then continue, with a timeout and error handler. Yes - I have searched the documentation!
Title: Re: A little python help please
Post by: Greenbank on 12 March, 2018, 05:48:46 pm: Is the response not just the value returned by the write call? (Ah no, it will return the number of bytes sent by the write() call.)

Looking at the documentation it seems that it shouldn't be, and that a write() can be followed by a read() to get a response, and these can be combined as a query, i.e.

resp = MS2712E.query(':MMEMory:MSIS USB')

What do you get if you do the following in the REPL:-

resp = MS2712E.write(':MMEMory:MSIS USB')
print resp
resp = MS2712E.read()
print resp

...and starting again...

resp = MS2712E.query(':MMEMory:MSIS USB')
print resp

?

Timeouts are handled by a separate variable: https://pyvisa.readthedocs.io/en/stable/resources.html#timeout
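
For the "wait for the response, with a timeout and error handler" part of the question, a minimal sketch might look like the one below. It assumes MS2712E is an already-opened PyVISA resource and that a timeout surfaces as pyvisa.errors.VisaIOError. The helper name query_or_none is made up for illustration, and the fallback class is only there so the sketch can be run without PyVISA installed.

```python
try:
    from pyvisa.errors import VisaIOError
except ImportError:
    class VisaIOError(Exception):
        """Stand-in so this sketch runs without PyVISA installed."""


def query_or_none(instrument, command, timeout_ms=5000):
    """Send command, wait up to timeout_ms for a reply, return it or None."""
    instrument.timeout = timeout_ms  # PyVISA timeouts are given in milliseconds
    try:
        # query() is a write() followed by a read()
        return instrument.query(command).strip()
    except VisaIOError as exc:
        # Error handler: report the failure and let the caller decide what next
        print("No response to %r: %s" % (command, exc))
        return None
```

Called as resp = query_or_none(MS2712E, ':MMEMory:MSIS USB'), the program can then test resp for None before continuing.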
Title: Re: A little python help please
Post by: aidan.f on 13 March, 2018, 08:21:07 am: Greenbank,

Thank you for a helpful reply
Your first example returns the string, seems obvious now.

This is the first four of the returned 400 csv screen levels in -dbm ..why, oh why, to six DP's?
Quote
#46393-113.124001,-113.332001,-111.680000,-111.568001

- having added the missing ',' to ':#46393,-113.1240...' I am considering how to parse out max and min values.
Title: Re: A little python help please
Post by: philip on 13 March, 2018, 01:50:00 pm: If the data is a string you could start by making it into a list of float:

Code: [Select]
data = "#46393,-113.124001,-113.332001,-111.680000,-111.568001"
nms = [float(x) for x in data.split(",")[1:]]  # skip the "#46393" field
print(min(nms))
print(max(nms))
What is the #46393 anyway? Is it the length of the data?
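
One possible answer, offered as an assumption rather than something confirmed from the instrument manual: the prefix looks like an IEEE 488.2 definite-length block header, where the digit after # says how many length digits follow (4) and those digits give the payload size in bytes (6393). That would also explain why there is no comma after it. A sketch that strips such a header without editing the string by hand (parse_trace is a hypothetical name):

```python
def parse_trace(raw):
    """Split an assumed '#<n><length>' block header from comma-separated floats."""
    if not raw.startswith("#"):
        raise ValueError("no block header found")
    ndigits = int(raw[1])                   # how many digits the length field has
    payload_len = int(raw[2:2 + ndigits])   # declared payload size in bytes
    payload = raw[2 + ndigits:]             # everything after the header
    levels = [float(x) for x in payload.split(",") if x.strip()]
    return payload_len, levels


length, levels = parse_trace("#46393-113.124001,-113.332001,-111.680000,-111.568001")
print(length, min(levels), max(levels))
```

The declared length is returned so that the caller can compare it with the amount of data actually received.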
Title: Re: Moar python help please
Post by: mrcharly-YHT on 13 March, 2018, 08:14:15 pm: If I have this:
str = " <a href=\"documentation - Copy.html\">documentationCopy.html</a>"
I can match it with this:
pat = re.compile(r"^[\w*\s\n<>.\"/_=-]+$", re.IGNORECASE)
using
pat.search(str)

However if I put the same string into a file and read the file using
with open(docfile, 'rt') as doc_in: # open file for reading
    docList = doc_in.read()

Then try
pat.search(docList)

I don't get a match.

Any clue why?
Title: Re: A little python help please
Post by: philip on 13 March, 2018, 08:34:21 pm: Your string literal has escaped quotes -- the two instances of the two character sequence \" in the href. These convert to a single " character in the value of the string. If your file also contains the two character sequences these will not be converted to a single character when the data is read from the file. That means the value read from the file will contain \ characters and so will be different from the value converted from the string literal.
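
A small illustration of that difference, using a shortened made-up href rather than the real file contents. The raw string stands in for what read() would return if the backslashes were physically present in the file:

```python
# In a normal string literal, \" is an escape that produces a single " character
from_literal = "href=\"doc.html\""

# A raw string keeps the backslashes, like text read from a file that contains them
as_stored_with_backslashes = r'href=\"doc.html\"'

print(from_literal)
print(as_stored_with_backslashes)
print(len(from_literal), len(as_stored_with_backslashes))
```

The second value is two characters longer, so a pattern written for one form is being applied to different text in the other.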
Title: Re: A little python help please
Post by: mrcharly-YHT on 13 March, 2018, 08:59:08 pm: Hmm - good point.

The file actually contains this:
 <a href="documentation - Copy.html">documentationCopy.html</a>

along with many other lines

dammit

I just tried it with a stripped version of the file, removing all the other lines. It matches then

So the match fails because of the preceding lines.
Title: Re: A little python help please
Post by: philip on 13 March, 2018, 09:07:33 pm: Your re starts with ^ which explicitly requires the match to be at the start of the data. Perhaps you want MULTILINE mode?
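
A sketch of the effect, using the pattern from the question on an invented three-line file whose first and last lines contain characters outside the character class (the ! and the comma):

```python
import re

text = (
    '<!DOCTYPE html>\n'
    ' <a href="documentation - Copy.html">documentationCopy.html</a>\n'
    '<p>Hello, world!</p>\n'
)
pattern = r'^[\w*\s\n<>."/_=-]+$'

# Without re.M, ^ only matches at the very start of the data
no_flag = re.search(pattern, text, re.IGNORECASE)

# With re.M, ^ and $ also match at the start and end of each line
with_flag = re.search(pattern, text, re.IGNORECASE | re.M)

print(no_flag)
print(with_flag.group())
```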
Title: Re: A little python help please
Post by: ian on 13 March, 2018, 09:19:56 pm: You have finally created a topic I can't post something stupid in. Congratulations.
Title: Re: A little python help please
Post by: mrcharly-YHT on 13 March, 2018, 09:28:39 pm: I may have tried to do something wrong. In googling this, I found a post on stackoverflow
Quote
Every time you attempt to parse HTML with regular expressions, the unholy child weeps the blood of virgins, and Russian hackers pwn your webapp. Parsing HTML with regex summons tainted souls into the realm of the living. HTML and regex go together like love, marriage, and ritual infanticide.
https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags

However, I have summoned the ungodly, used re.M and parsed html.
Title: Re: A little python help please
Post by: Greenbank on 14 March, 2018, 08:07:44 am: The file will contain a carriage return (and possibly a line-feed) at the end of the line but your regex does not expect this (it expects the end of the string, i.e. no more characters, after the last > character).

Either:-
a) Strip off the carriage return so that your regex matches (it will also fail if there are multiple lines as the [\w*\s\n<>.\"/_=-] regex will not match intermediate CR/LF), i.e. rstrip() or rstrip( '\n' )
b) use multiline mode to mask the problem
c) Modify the regex to expect a carriage return at the end, i.e. something like this (untested):-

Code: [Select]
pat = re.compile(r"^[\w*\s\n<>.\"/_=-]+[\r\n]*$", re.IGNORECASE)
Regex can be used for simple parsing of HTML but if the parsing requirements start to grow it quickly escalates into insanity. Then you want to move to a proper parser such as html.parser (https://docs.python.org/3/library/html.parser.html)
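
As a rough idea of what the html.parser route looks like, here is a minimal sketch that collects the href of every a tag. The class name LinkCollector is made up for illustration, and the sample line is the one quoted earlier in the thread:

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> start tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)


collector = LinkCollector()
collector.feed(' <a href="documentation - Copy.html">documentationCopy.html</a>\n')
print(collector.links)
```

Line endings and surrounding lines stop mattering here, because the parser deals with the tag structure itself.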
Title: Re: A little python help please
Post by: mrcharly-YHT on 14 March, 2018, 08:24:36 am: Using multiline worked.

I ran out of time to do the next bit, which is to replace the matched string and then write out the result to the original file.
Title: Re: A little python help please
Post by: Oaky on 14 March, 2018, 09:45:25 am: Quote from: ian on 13 March, 2018, 09:19:56 pm
You have finally created a topic I can't post something stupid in. Congratulations.

No small serpents in Surrey?
Title: Re: A little python help please
Post by: ian on 14 March, 2018, 09:51:46 am: Bad Cat used to bring in slow worms but it's not happened recently so I assume they've got faster.
Title: Re: A little python help please
Post by: aidan.f on 14 March, 2018, 10:53:24 pm: Wow! wot have i started!. :-). ATM no coding as out on site in Kent (bluebell hill) for a couple of days. Then I may well return to this topic to ask something stupid
Title: Re: A little python help please
Post by: Chris S on 14 March, 2018, 11:20:46 pm: Quote from: aidan.f on 14 March, 2018, 10:53:24 pm
Wow! wot have i started!. :-). ATM no coding as out on site in Kent (bluebell hill) for a couple of days. Then I may well return to this topic to ask something stupid

Did you take a bike? Some of the cycling is quite steep round there; up and down the North Downs escarpment is good for the soul.
Title: Re: A little python help please
Post by: mrcharly-YHT on 15 March, 2018, 09:10:14 am: lol

Last night I realised I was probably doing things all wrong so needed a different approach. Maybe I should just buy a decent book on Python instead of googling (seriously, can't sites just give syntax?).

There seem to be 9million different ways you could possibly achieve any given goal and the lack of explicit typing does my head in. End up with errors because I don't know what type something is, and end up using incompatible types. Well that would be solved if, erm types had to be declared and it was all, you know, explicit instead of this hacky crap.
Title: Re: A little python help please
Post by: David Martin on 15 March, 2018, 08:32:33 pm: Python is both fantastic and the worst thing ever, often at the same time.
I've just downloaded some code in 2.7 but I run 3.5 so will be seeing if it runs.
Title: Re: A little python help please
Post by: freeflow on 16 March, 2018, 06:06:04 pm: A recent introduction to Python 3 (as of 3.5) is type hinting

https://www.python.org/dev/peps/pep-0484/
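
A minimal sketch of what the hints look like, with a made-up function. The hints are not enforced when the program runs; a separate checker such as mypy is needed to flag mismatches before running.

```python
from typing import List


def level_range(levels: List[float]) -> float:
    """Return the spread between the highest and lowest level."""
    return max(levels) - min(levels)


spread = level_range([-113.124001, -113.332001, -111.680000, -111.568001])
print(spread)
```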
Title: Re: Moar python help please
Post by: vorsprung on 17 March, 2018, 01:03:38 pm: Quote from: mrcharly-YHT on 13 March, 2018, 08:14:15 pm
If I have this:
str = " <a href=\"documentation - Copy.html\">documentationCopy.html</a>"
I can match it with this:
pat = re.compile(r"^[\w*\s\n<>.\"/_=-]+$", re.IGNORECASE)
using
pat.search(str)

However if I put the same string into a file and read the file using
with open(docfile, 'rt') as doc_in: # open file for reading
    docList = doc_in.read()

Then try
pat.search(docList)

I don't get a match.

Any clue why?

because you are trying to process html with regexp and not with a parser like BeautifulSoup
Title: Re: A little python help please
Post by: mrcharly-YHT on 17 March, 2018, 01:14:38 pm: I actually managed to get that working - but it was a blind alley anyway. Will return to this later.

All I want to do is parse an HTML file for two markers, then replace the markers and any text between them with a string that I've created (that starts and ends with the markers).
Title: Re: A little python help please
Post by: Pickled Onion on 18 March, 2018, 07:06:14 am: Quote from: mrcharly-YHT on 17 March, 2018, 01:14:38 pm
I actually managed to get that working - but it was a blind alley anyway. Will return to this later.

All I want to do is parse an HTML file for two markers, then replace the markers and any text between them with a string that I've created (that starts and ends with the markers).

Aha! That's not parsing, that's just matching, so you won't be raising the undead and feeding your firstborn to them. Regular expressions will match the *bit* of the string you want them to match, so you don't need the (start|end)-of-(string|line) markers, or the *s to match outside what you actually want to match.

A couple of points. Regular expressions match "longest leftmost", so if there are multiple sets of marker pairs you will need to handle that differently to the case where there is only one. I'm not sure about your "character class" in the middle, do you really not want to match when there's a digit between your markers? If you want to match everything then say so (.+)

One of the online regex debuggers is a useful tool to try things out and see why they're not working.
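
A sketch of matching just the marked section and replacing it. The marker strings are assumptions, since the real ones are not shown in the thread. Quantifiers in Python's re module are greedy by default, so .* would run on to the last end marker in the file; .*? stops at the first one.

```python
import re

# Hypothetical markers; substitute whatever the real file uses
START = "<!-- docstart -->"
END = "<!-- docend -->"


def replace_between_markers(html, new_block):
    """Replace the first START...END section (markers included) with new_block."""
    pattern = re.compile(re.escape(START) + r".*?" + re.escape(END), re.DOTALL)
    # A function as the replacement keeps any backslashes in new_block literal
    return pattern.sub(lambda match: new_block, html, count=1)


page = "<p>intro</p>\n" + START + "\nold list\n" + END + "\n<p>end</p>\n"
new_block = START + '\n<a href="a.html">a.html</a>\n' + END
print(replace_between_markers(page, new_block))
```

Reading the file into a string, calling the function, and writing the result back would complete the job.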
Title: Re: A little python help please
Post by: mrcharly-YHT on 18 March, 2018, 03:50:42 pm: Quote from: Pickled Onion on 18 March, 2018, 07:06:14 am
A couple of points. Regular expressions match "longest leftmost", so if there are multiple sets of marker pairs you will need to handle that differently to the case where there is only one. I'm not sure about your "character class" in the middle, do you really not want to match when there's a digit between your markers? If you want to match everything then say so (.+)
I'm not sure what you mean. That expression successfully matches on strings that contain integer characters. The \w matches on alphanumerics.
Title: Re: A little python help please
Post by: Mr Larrington on 18 March, 2018, 08:51:03 pm: Just use FORTRAN.

(Runs away)
Title: Re: A little python help please
Post by: mrcharly-YHT on 18 March, 2018, 09:13:31 pm: I actually did use FORTRAN 77 at uni and would probably find it easier to follow.

In the end this turned out to be incredibly easy, once I ignored all the shite on stackoverflow.
Title: Re: A little python help please
Post by: perpetual dan on 18 March, 2018, 10:35:21 pm: > There seem to be 9million different ways you could possibly achieve any given goal
That's a feature. :P

I'll +1 the use of Beautiful Soup, unless your goal is to learn how to use regexes.
Title: Re: A little python help please
Post by: Pickled Onion on 19 March, 2018, 08:12:22 am: Quote from: mrcharly-YHT on 18 March, 2018, 03:50:42 pm
Quote from: Pickled Onion on 18 March, 2018, 07:06:14 am
A couple of points. Regular expressions match "longest leftmost", so if there are multiple sets of marker pairs you will need to handle that differently to the case where there is only one. I'm not sure about your "character class" in the middle, do you really not want to match when there's a digit between your markers? If you want to match everything then say so (.+)
I'm not sure what you mean. That expression successfully matches on strings that contain integer characters. The \w matches on alphanumerics.

Oops - yes you're right, I didn't realise that \w (word characters) includes digits and underscore. I have learnt something (and I should have taken my own advice to use a regex checker!)

What I was trying to say is, do you really want to fail to match your docstart/docend when there are any other characters not in your list, like a comma, question mark, single quote, etc?
Title: Re: A little python help please
Post by: mrcharly-YHT on 19 March, 2018, 08:29:39 am: Quote from: Pickled Onion on 19 March, 2018, 08:12:22 am
What I was trying to say is, do you really want to fail to match your docstart/docend when there are any other characters not in your list, like a comma, question mark, single quote, etc?
There shouldn't be.
The bit between the docstart/docend is a list of files, generated using glob.glob("*.*")
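
A sketch of building such a list, assuming each file becomes one a tag like the line shown earlier in the thread. The function name links_for is made up, and html.escape is used in case a name ever contains &, < or a quote:

```python
import glob
import html


def links_for(names):
    """Return one <a> line per file name, sorted, joined with newlines."""
    lines = []
    for name in sorted(names):
        href = html.escape(name, quote=True)    # safe inside the attribute
        label = html.escape(name, quote=False)  # safe as link text
        lines.append('<a href="{0}">{1}</a>'.format(href, label))
    return "\n".join(lines)


# Files in the current directory, as in the post above
print(links_for(glob.glob("*.*")))
```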
Title: Re: A little python help please
Post by: andrew_s on 19 March, 2018, 09:39:03 am: It depends where the file names came from.
If there's been scope for people typing them in, there could be any characters, including non-ascii (66/99 quotes, em-dashes etc).
Title: Re: A little python help please
Post by: mrcharly-YHT on 19 March, 2018, 10:25:17 am: If there is any of that crap in these filenames, I don't want them listing!

These are files for distribution in a release package, so the filenames are in a controlled format.

This pattern turned out to work. The previous one didn't work on Linux

Code: [Select]
pat = re.compile(r"[\n\w\s<>.\"/_=-]*$", re.IGNORECASE|re.M)
Title: Re: A little python help please
Post by: Greenbank on 19 March, 2018, 12:26:20 pm: People can only go on what you've given them and with little context (examples, etc) they're going to question some of the decisions made, i.e. the character class:-

Code: [Select]
pat = re.compile(r"[\n\w\s<>.\"/_=-]*$", re.IGNORECASE|re.M)
as most people would just expect to do:-

Code: [Select]
pat = re.compile(r".+$", re.IGNORECASE|re.M)
It all depends on information they don't have, i.e. are any lines likely to contain extra docstart or docend tags, are some lines to be rejected because of invalid/unwanted data, etc.

Without that context a character class such as [\n\w\s<>.\"/_=-] is a red flag to me, especially with the \n in there (before the regex was in multi-line mode). Another red flag is the lack of a start of string anchor given the presence of an end of string anchor - both or neither in general (but, again, I don't know the full context).

If it's just a set of lines that need converting and the list will be checked afterwards then there's little point over-engineering a one off task. If it works, move on. If it's something that will be run again and again (as part of a release) then I'd expect something a lot more defensively minded.
Title: Re: A little python help please
Post by: mrcharly-YHT on 19 March, 2018, 12:54:05 pm: A python programmer here had a look and suggested using .*

It didn't match, to his surprise. We tried a few versions of that. Only the pattern I've put up matches every time.
Title: Re: A little python help please
Post by: vorsprung on 24 March, 2018, 09:29:56 pm: Quote from: mrcharly-YHT on 19 March, 2018, 12:54:05 pm
A python programmer here had a look and suggested using .*

It didn't match, to his surprise. We tried a few versions of that. Only the pattern I've put up matches every time.

unless you are in re.S (DOTALL) mode, the . metacharacter doesn't match \n; tell your programmer that
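
A short demonstration on an invented multi-line string, with docstart and docend standing in for the real markers:

```python
import re

text = "docstart\nfile1.html\nfile2.html\ndocend"

# Default: . stops at every newline, so the pattern cannot span lines
plain = re.search(r"docstart.*docend", text)

# re.S (DOTALL): . also matches \n, so the whole section matches
dotall = re.search(r"docstart.*docend", text, re.S)

# re.M alone: .+$ matches one line at a time
per_line = re.findall(r".+$", text, re.M)

print(plain)
print(dotall.group() == text)
print(per_line)
```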