Yet Another Cycling Forum
General Category => The Knowledge => Ctrl-Alt-Del => Topic started by: aidan.f on 12 March, 2018, 01:24:49 pm
-
I'm writing a simple program to interrogate a measuring instrument using PyVisa. If for example I send this command from a terminal console:
>>> MS2712E.write(':MMEMory:MSIS USB')
I get echoed back:
(32, <StatusCode.success: 0>)
What I cannot understand is how to capture this in a variable within my program code. I wish to wait for the response and then continue, with a timeout and error handler. Yes - I have searched the documentation!
-
Is the response not just the value returned by the write call? (Ah no, it will return the number of bytes sent by the write() call.)
Looking at the documentation it seems that it shouldn't be, and that a write() can be followed by a read() to get a response, and these can be combined as a query, i.e.
resp = MS2712E.query(':MMEMory:MSIS USB')
What do you get if you do the following in the REPL:-
resp = MS2712E.write(':MMEMory:MSIS USB')
print(resp)
resp = MS2712E.read()
print(resp)
...and starting again...
resp = MS2712E.query(':MMEMory:MSIS USB')
print(resp)
?
Timeouts are handled by a separate variable: https://pyvisa.readthedocs.io/en/stable/resources.html#timeout
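For what it's worth, here is a minimal sketch of waiting for a response with a timeout and an error handler. It assumes PyVISA's `timeout` attribute (in milliseconds) and its `VisaIOError` exception. The helper name `query_or_none` and the `FakeInstrument` class are made up for illustration so the sketch runs without any hardware attached.

```python
try:
    from pyvisa.errors import VisaIOError
except ImportError:
    # Assumption: PyVISA is not installed, so define a stand-in exception
    # only to keep this sketch runnable without it.
    class VisaIOError(Exception):
        pass


def query_or_none(instrument, command, timeout_ms=5000):
    """Send command, wait for the reply; return None if VISA I/O fails."""
    instrument.timeout = timeout_ms  # PyVISA timeouts are in milliseconds
    try:
        # query() is a write() followed by a read()
        return instrument.query(command).strip()
    except VisaIOError as exc:
        print("No response to %r: %s" % (command, exc))
        return None


class FakeInstrument:
    """Hypothetical stand-in for a PyVISA resource (no hardware needed)."""

    def query(self, command):
        return "FAKE,INSTRUMENT,0,0\n"


fake = FakeInstrument()
reply = query_or_none(fake, "*IDN?")  # *IDN? is the standard identification query
print(reply)
```

With a real instrument you would pass the resource object opened through PyVISA in place of `FakeInstrument()`. A timeout then surfaces as a `VisaIOError`, which the helper turns into `None`.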
-
Greenbank,
Thank you for a helpful reply
Your first example returns the string, seems obvious now.
This is the first four of the returned 400 CSV screen levels in -dBm ...why, oh why, to six DPs?
#46393-113.124001,-113.332001,-111.680000,-111.568001
- having added the missing ',' to ':#46393,-113.1240...' I am considering how to parse out max and min values.
-
If the data is a string you could start by making it into a list of float:
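data = "#46393,-113.124001,-113.332001,-111.680000,-111.568001"
# Build a real list (not a one-shot map iterator) so min() and max() both work;
# "data" also avoids shadowing the built-in str.
nms = [float(v) for v in data.split(",")[1:]]  # [1:] skips the "#46393" field
print(min(nms))
print(max(nms))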
What is the #46393 anyway? Is it the length of the data?
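The `#46393` looks like an IEEE 488.2 definite-length block header: `#`, then one digit saying how many length digits follow (4), then the payload length in bytes (6393). A rough sketch under that assumption follows. `split_block` is a hypothetical helper name, and the sample reply is a shortened, made-up one whose header has been adjusted to match its own length.

```python
def split_block(reply):
    """Split '#<n><length><payload>' into (declared_length, payload)."""
    if not reply.startswith("#"):
        raise ValueError("no block header found")
    n_digits = int(reply[1])              # how many digits encode the length
    length = int(reply[2:2 + n_digits])   # declared payload length
    payload = reply[2 + n_digits:]
    return length, payload


sample = "#223-113.124001,-113.332001"  # shortened, made-up example reply
length, payload = split_block(sample)
levels = [float(v) for v in payload.split(",")]
print(length, min(levels), max(levels))
```

Stripping the header this way means there is no need to insert a comma by hand before splitting.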
-
If I have this:
str = "<!--docstart --> <p><a href=\"documentation - Copy.html\">documentationCopy.html</a></p><!--docend -->"
I can match it with this:
pat = re.compile(r"^<!--docstart -->[\w*\s\n<>.\"/_=-]+<!--docend -->$", re.IGNORECASE)
using
pat.search(str)
However if I put the same string into a file and read the file using
with open(docfile, 'rt') as doc_in: # open file for reading
    docList = doc_in.read()
Then try
pat.search(docList)
I don't get a match.
Any clue why?
-
Your string literal has escaped quotes -- the two instances of the two character sequence \" in the href. These convert to a single " character in the value of the string. If your file also contains the two character sequences these will not be converted to a single character when the data is read from the file. That means the value read from the file will contain \ characters and so will be different from the value converted from the string literal.
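A small sketch of the difference, in case it helps. The raw string below only simulates what you would get if the file itself contained the backslashes; the variable names are made up.

```python
# In source code, \" inside a normal string literal becomes a single " character.
literal = "<a href=\"documentation - Copy.html\">"

# A raw string keeps the backslashes, like text read unchanged from a file
# that really contains the two-character sequence \".
from_file = r'<a href=\"documentation - Copy.html\">'

print(len(literal), len(from_file))
print(literal == from_file)
```

The two values differ by exactly the two backslash characters, which is enough to stop a pattern matching.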
-
Hmm - good point.
The file actually contains this:
<!--docstart --> <p><a href="documentation - Copy.html">documentationCopy.html</a></p><!--docend -->
along with many other lines
dammit
I just tried it with a stripped version of the file, removing all the other lines. It matches then
So the match fails because of the preceding lines.
-
Your re starts with ^ which explicitly requires the match to be at the start of the data. Perhaps you want MULTILINE mode?
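A minimal sketch of the effect, using a simplified pattern (`.*` instead of the character class from the thread) and made-up sample text:

```python
import re

text = "first line\n<!--docstart --> files <!--docend -->\nlast line\n"

plain = re.compile(r"^<!--docstart -->.*<!--docend -->$")
multi = re.compile(r"^<!--docstart -->.*<!--docend -->$", re.MULTILINE)

print(plain.search(text))  # None: ^ only matches at the very start of the data
print(multi.search(text))  # a match: ^ and $ now apply at each line boundary
```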
-
You have finally created a topic I can't post something stupid in. Congratulations.
-
I may have tried to do something wrong. In googling this, I found a post on stackoverflow
Every time you attempt to parse HTML with regular expressions, the unholy child weeps the blood of virgins, and Russian hackers pwn your webapp. Parsing HTML with regex summons tainted souls into the realm of the living. HTML and regex go together like love, marriage, and ritual infanticide.
https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags
However, I have summoned the ungodly, used re.M and parsed html.
-
The file will contain a carriage return (and possibly a line-feed) at the end of the line but your regex does not expect this (it expects the end of the string, i.e. no more characters, after the last > character).
Either:-
a) Strip off the carriage return so that your regex matches (it will also fail if there are multiple lines as the [\w*\s\n<>.\"/_=-] regex will not match intermediate CR/LF), i.e. rstrip() or rstrip( '\n' )
b) use multiline mode to mask the problem
c) Modify the regex to expect a carriage return at the end, i.e. something like this (untested):-
pat = re.compile(r"^<!--docstart -->[\w*\s\n<>.\"/_=-]+<!--docend -->[\r\n]*$", re.IGNORECASE)
Regex can be used for simple parsing of HTML but if the parsing requirements start to grow it quickly escalates into insanity. Then you want to move to a proper parser such as html.parser (https://docs.python.org/3/library/html.parser.html)
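As a rough idea of what the parser route looks like, here is a sketch that collects link targets with the standard library's `html.parser`. The class name `LinkCollector` is made up; the input is the line from the file quoted earlier in the thread.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)


collector = LinkCollector()
collector.feed('<!--docstart --> <p><a href="documentation - Copy.html">'
               'documentationCopy.html</a></p><!--docend -->')
print(collector.links)
```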
-
Using multiline worked.
I ran out of time to do the next bit, which is to replace the matched string and then write out the result to the original file.
-
> You have finally created a topic I can't post something stupid in. Congratulations.
No small serpents in Surrey?
-
Bad Cat used to bring in slow worms but it's not happened recently so I assume they've got faster.
-
Wow! wot have i started!. :-). ATM no coding as out on site in Kent (bluebell hill) for a couple of days. Then I may well return to this topic to ask something stupid
-
> Wow! wot have i started!. :-). ATM no coding as out on site in Kent (bluebell hill) for a couple of days. Then I may well return to this topic to ask something stupid
Did you take a bike? Some of the cycling is quite steep round there; up and down the North Downs escarpment is good for the soul.
-
lol
Last night I realised I was probably doing things all wrong so needed a different approach. Maybe I should just buy a decent book on Python instead of googling (seriously, can't sites just give syntax?).
There seem to be 9million different ways you could possibly achieve any given goal and the lack of explicit typing does my head in. End up with errors because I don't know what type something is, and end up using incompatible types. Well that would be solved if, erm types had to be declared and it was all, you know, explicit instead of this hacky crap.
-
Python is both fantastic and the worst thing ever, often at the same time.
I've just downloaded some code in 2.7 but I run 3.5 so will be seeing if it runs.
-
A recent introduction to Python 3 (as of 3.5) is type hinting
https://www.python.org/dev/peps/pep-0484/
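A small sketch of what the hints look like. The function name `level_range` and the sample values are only for illustration. Note that the interpreter does not enforce the hints at run time; a separate checker such as mypy reads them.

```python
from typing import List


def level_range(levels: List[float]) -> float:
    """Return the spread between the highest and lowest level."""
    return max(levels) - min(levels)


spread = level_range([-113.124001, -113.332001, -111.68, -111.568001])
print(spread)
```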
-
> I don't get a match.
> Any clue why?
because you are trying to process html with regexp and not with a parser like BeautifulSoup
-
I actually managed to get that working - but it was a blind alley anyway. Will return to this later.
All I want to do is parse an HTML file for two markers, then replace the markers and any text between them with a string that I've created (that starts and ends with the markers).
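For reference, one possible way to do that replacement, sketched under a few assumptions: only the first marker pair should be replaced, the block may span several lines, and the helper name `replace_block` and the sample page are made up.

```python
import re

START, END = "<!--docstart -->", "<!--docend -->"

# Non-greedy .*? with DOTALL so the span may cover several lines.
block = re.compile(re.escape(START) + r".*?" + re.escape(END), re.DOTALL)


def replace_block(html, new_block):
    """Swap the first marker-delimited block for new_block."""
    # A function replacement avoids backslash-escape processing in new_block.
    return block.sub(lambda match: new_block, html, count=1)


page = "<h1>Docs</h1>\n" + START + "\n<p>old list</p>\n" + END + "\n<p>footer</p>\n"
new_block = START + "\n<p>new list</p>\n" + END
result = replace_block(page, new_block)
print(result)
```

To update a file, read it into a string with `open(...).read()`, pass it through `replace_block`, and write the result back out.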
-
> I actually managed to get that working - but it was a blind alley anyway. Will return to this later.
> All I want to do is parse an HTML file for two markers, then replace the markers and any text between them with a string that I've created (that starts and ends with the markers).
Aha! That's not parsing, that's just matching, so you won't be raising the undead and feeding your firstborn to them. Regular expressions will match the *bit* of the string you want them to match, so you don't need the (start|end)-of-(string|line) markers, or the *s to match outside what you actually want to match.
A couple of points. Regular expressions match "longest leftmost", so if there are multiple sets of marker pairs you will need to handle that differently to the case where there is only one. I'm not sure about your "character class" in the middle, do you really not want to match when there's a digit between your markers? If you want to match everything then say so (.+)
One of the online regex debuggers is a useful tool to try things out and see why they're not working.
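To illustrate the multiple-pairs point, a quick sketch with made-up text containing two marker pairs on one line. It compares the greedy `.+` with the non-greedy `.+?`:

```python
import re

text = ("<!--docstart -->A<!--docend --> keep me "
        "<!--docstart -->B<!--docend -->")

greedy = re.search(r"<!--docstart -->.+<!--docend -->", text)
lazy = re.findall(r"<!--docstart -->.+?<!--docend -->", text)

print(greedy.group(0) == text)  # the greedy match swallows both pairs
print(len(lazy))                # the non-greedy form finds each pair separately
```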
-
> A couple of points. Regular expressions match "longest leftmost", so if there are multiple sets of marker pairs you will need to handle that differently to the case where there is only one. I'm not sure about your "character class" in the middle, do you really not want to match when there's a digit between your markers? If you want to match everything then say so (.+)
I'm not sure what you mean. That expression successfully matches on strings that contain integer characters. The \w matches on alphanumerics.
-
Just use FORTRAN.
(Runs away)
-
I actually did use FORTRAN 77 at uni and would probably find it easier to follow.
In the end this turned out to be incredibly easy, once I ignored all the shite on stackoverflow.
-
> There seem to be 9million different ways you could possibly achieve any given goal
That's a feature. :P
I'll +1 the use of Beautiful Soup, unless your goal is to learn how to use regexes.
-
> I'm not sure what you mean. That expression successfully matches on strings that contain integer characters. The \w matches on alphanumerics.
Oops - yes you're right, I didn't realise that \w (word characters) includes digits and underscore. I have learnt something (and I should have taken my own advice to use a regex checker!)
What I was trying to say is, do you really want to fail to match your docstart/docend when there are any other characters not in your list, like a comma, question mark, single quote, etc?
-
> What I was trying to say is, do you really want to fail to match your docstart/docend when there are any other characters not in your list, like a comma, question mark, single quote, etc?
There shouldn't be.
The bit between the docstart/docend is a list of files, generated using glob.glob("*.*")
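A sketch of how such a block might be built from a list of names. The markup layout is copied from the line quoted earlier in the thread; the helper name `build_block` and the fixed file list are assumptions for illustration, standing in for the `glob.glob("*.*")` result.

```python
import html

START, END = "<!--docstart -->", "<!--docend -->"


def build_block(filenames):
    """Return the marker-delimited block of links for the given file names."""
    lines = [START]
    for name in sorted(filenames):
        safe = html.escape(name, quote=True)  # guard against &, <, > and quotes
        lines.append('<p><a href="%s">%s</a></p>' % (safe, safe))
    lines.append(END)
    return "\n".join(lines)


# In the thread the names come from glob.glob("*.*"); a fixed list is used here.
block_text = build_block(["readme.txt", "documentation - Copy.html"])
print(block_text)
```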
-
It depends where the file names came from.
If there's been scope for people typing them in, there could be any characters, including non-ASCII (66/99 quotes, em-dashes etc).
-
If there is any of that crap in these filenames, I don't want them listing!
These are files for distribution in a release package, so the filenames are in a controlled format.
This pattern turned out to work. The previous one didn't work on Linux
pat = re.compile(r"<!--docstart -->[\n\w\s<>.\"/_=-]*<!--docend -->$", re.IGNORECASE|re.M)
-
People can only go on what you've given them and with little context (examples, etc) they're going to question some of the decisions made, i.e. the character class:-
pat = re.compile(r"<!--docstart -->[\n\w\s<>.\"/_=-]*<!--docend -->$", re.IGNORECASE|re.M)
as most people would just expect to do:-
pat = re.compile(r"<!--docstart -->.+<!--docend -->$", re.IGNORECASE|re.M)
It all depends on information they don't have, i.e. are any lines likely to contain extra docstart or docend tags, are some lines to be rejected because of invalid/unwanted data, etc.
Without that context a character class such as [\n\w\s<>.\"/_=-] is a red flag to me, especially with the \n in there (before the regex was in multi-line mode). Another red flag is the lack of a start of string anchor given the presence of an end of string anchor - both or neither in general (but, again, I don't know the full context).
If it's just a set of lines that need converting and the list will be checked afterwards then there's little point over-engineering a one off task. If it works, move on. If it's something that will be run again and again (as part of a release) then I'd expect something a lot more defensively minded.
-
A Python programmer here had a look and suggested using .*
It didn't match, to his surprise. We tried a few versions of that. Only the pattern I've put up matches every time.
-
> A Python programmer here had a look and suggested using .*
> It didn't match, to his surprise. We tried a few versions of that. Only the pattern I've put up matches every time.
Unless you are in re.S (DOTALL) mode, . doesn't match \n. Tell your programmer that.
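A quick sketch showing the difference, with made-up multi-line text between the markers:

```python
import re

text = "<!--docstart -->\n<p>one</p>\n<p>two</p>\n<!--docend -->\n"

without_s = re.search(r"<!--docstart -->.*<!--docend -->", text)
with_s = re.search(r"<!--docstart -->.*<!--docend -->", text, re.S)

print(without_s)           # None: . stops at the first newline
print(with_s is not None)  # True: with re.S, . also matches \n
```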