Unable to save a string into a variable

I work with HP uCMDB to extract data from servers. In my python script I have this:

iostat_cmd = client.executeCmd('iostat -En '+disk+'|egrep \'Vendor|Size\'')

Which executes iostat and returns this:

-bash-3.2$ iostat -En|egrep "Vendor|Size"
Vendor: VMware   Product: Virtual disk     Revision: 1.0  Serial No:  
Size: 8.59GB <8589934080 bytes>

So far so good, but this is where the problems start. Instead of saving this as a string, it saves it as a 'unicode' object. From this point on I use string manipulation and regex patterns, but none of them work: I can't strip the newline characters, I can't split on a regex pattern, and so on. I can't even force it into a string.

Here is the problematic part of the code, with the print output:

        iostat_cmd = client.executeCmd('iostat -En '+disk+'|egrep \'Vendor|Size\'')
        iostat_cmd = iostat_cmd.split(r'\s\s+')
        print iostat_cmd
        print type(iostat_cmd)
jvm 3    | [u'Vendor: VMware   Product: Virtual disk     Revision: 1.0  Serial No:      \r\nSize: 8.59GB <8589934080 bytes>']
jvm 3    | <type 'list'>

Basically, I want to remove the newline and carriage return. Then I want to split the string into a list using the \s\s+ regex pattern (that is, two or more whitespace characters) and return the values to the application. Please note that I have tested this pattern online and it should work.

I also tried it this way:

        iostat_cmd = client.executeCmd('iostat -En '+disk+'|egrep \'Vendor|Size\'')
        iostat_cmd = str(iostat_cmd)
        print iostat_cmd
        print type(iostat_cmd)
jvm 3    | Vendor: VMware   Product: Virtual disk     Revision: 1.0  Serial No:  
jvm 3    | Size: 8.59GB <8589934080 bytes>
jvm 3    | <type 'str'>
        iostat_cmd = iostat_cmd.replace(r'\r',' ')
        print iostat_cmd
        print type(iostat_cmd)
jvm 3    | Vendor: VMware   Product: Virtual disk     Revision: 1.0  Serial No:  
jvm 3    | Size: 8.59GB <8589934080 bytes>
jvm 3    | <type 'str'>
        iostat_cmd = iostat_cmd.split(r'\s\s+')
        print iostat_cmd
        print type(iostat_cmd)
jvm 3    | ['Vendor: VMware   Product: Virtual disk     Revision: 1.0  Serial No:    \r\nSize: 8.59GB <8589934080 bytes>']
jvm 3    | <type 'list'>

Any ideas what I'm doing wrong? I can't seem to grasp it; I've done it like this for years. Why does it save the string as a unicode object, and why does it neither split on the pattern nor remove the characters with the replace function?

asked May 28 '14 at 14:05

1 Answer

There's nothing wrong with a unicode object. The problem here is that str.split doesn't take a regex, only a literal separator string, so you need re.split:

>>> import re
>>> iostat_cmd = u'Vendor: VMware   Product: Virtual disk     Revision: 1.0  Serial No:      \r\nSize: 8.59GB <8589934080 bytes>'
>>> re.split(r'\s\s+', iostat_cmd)
[u'Vendor: VMware', u'Product: Virtual disk', u'Revision: 1.0', u'Serial No:', u'Size: 8.59GB <8589934080 bytes>']
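
The replace attempt in the question fails for a similar reason: r'\r' is a raw string containing a backslash followed by the letter r, not a carriage return, so there is nothing to replace. A minimal sketch pulling both points together, assuming the command output looks like the sample above (the variable names `raw`, `unchanged`, `literal_split` and `fields` are only illustrative, and `raw` stands in for whatever `client.executeCmd` returns):

```python
import re

# Sample output copied from the question; in the real script this value
# would come from client.executeCmd(...).
raw = u'Vendor: VMware   Product: Virtual disk     Revision: 1.0  Serial No:      \r\nSize: 8.59GB <8589934080 bytes>'

# r'\r' is two characters (backslash + 'r'), which never occur in the
# output, so replace() returns the string unchanged.
unchanged = raw.replace(r'\r', ' ')

# split() treats its argument as literal text, not as a pattern, so the
# result is a one-element list holding the whole string.
literal_split = raw.split(r'\s\s+')

# re.split() interprets the pattern. \s also matches \r and \n, so the
# line break is consumed together with the padding spaces before it.
fields = re.split(r'\s\s+', raw.strip())

print(fields)
```

Because \s already covers carriage returns and newlines, no separate replace step should be needed before the split; strip() only guards against leading or trailing whitespace that would otherwise produce empty entries.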

answered May 28 '14 at 15:05

You are absolutely right. Please excuse me while I step outside and shoot myself. - XV

No need to shoot yourself, you could accept the answer though? :) - Chris Clarke
