I primarily use R for statistics, and it is also installed on our secure system. I generate spreadsheets with keyed information, then "unkey" them once they are moved to the secure system. Replacing known keys with the unkeyed information by hand is extremely labor intensive. I thought it would be relatively easy in R using a function (looping through all the Excel files from a given project), but I am having difficulty editing the XML file that holds the keyed info. I can find this file by unzipping the xlsx file, and I can see the elements in R:
> doc <- xmlTreeParse("sharedStrings.xml", useInternalNodes = TRUE)
This gives me the xml code with the strings I want to replace with real values (Sorry! I can't paste here because it interprets the xml!).
So I would like my function to find "DDR022-02-S2", for instance, and replace it with the real unkeyed value. I could then save it over the original XML, then rezip to xlsx and not lose all the formatting and other information in the spreadsheet.
I have tried numerous examples here and I don't know if it's my syntax or the schema or what, but using xml_find_all I can't seem to find any text nodes, so I can't use gsub to change values in the file (unexpected node type). I tried converting using xml_text, but it cannot coerce the type. Anyway, I've been at this for almost 2 days and I suspect the answer is less complicated than I'm making it, but I don't routinely (or ever) use or parse XML. I'm decent with Matlab and R, but I really could use help figuring out which R function(ality) could search this XML file and replace a given text string.
I have manually edited the file using Notepad++, and I can rezip it and the sheet looks like it should (with the replaced text). With many, many files, doing this in R would save many, many hours!
Thanks for any help!
Asked Mar 11 at 14:23 by user2299029
1 Answer
I know too little about XML... Found this: stackoverflow/questions/64243628/….
Looks like I had a namespace...
xml_ns(doc) reports the prefix "d1" for the default namespace.
Now running
> xml_find_all(doc, "//d1:t") # finds the text nodes!
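For anyone landing here later, a minimal sketch of how the namespace fix could be turned into the find-and-replace the question asks for, assuming the xml2 package. The lookup table `key_map`, the helper `replace_keys`, and the replacement values are hypothetical names made up for illustration, and the inline XML is only a small stand-in for a real sharedStrings.xml.

```r
library(xml2)

# Hypothetical lookup table: names are the keyed IDs, values are the real ones.
key_map <- c("DDR022-02-S2" = "real-value-1", "D2232-15-S1" = "real-value-2")

# Small stand-in for sharedStrings.xml; a real file would be read with
# read_xml("sharedStrings.xml") after unzipping the xlsx.
doc <- read_xml(paste0(
  "<sst xmlns='http://schemas.openxmlformats.org/spreadsheetml/2006/main'>",
  "<si><t>DDR022-02-S2</t></si>",
  "<si><t>D2232-15-S1</t></si>",
  "<si><t>MP223-21-S2</t></si>",
  "</sst>"
))

replace_keys <- function(doc, key_map) {
  # xml_ns() assigns the prefix d1 to the default namespace,
  # so the <t> elements have to be addressed as d1:t.
  nodes <- xml_find_all(doc, "//d1:t", xml_ns(doc))
  txt <- xml_text(nodes)
  for (key in names(key_map)) {
    txt <- gsub(key, key_map[[key]], txt, fixed = TRUE)
  }
  # xml2 modifies the document in place through the node set.
  xml_text(nodes) <- txt
  invisible(doc)
}

replace_keys(doc, key_map)
result <- xml_text(xml_find_all(doc, "//d1:t", xml_ns(doc)))
print(result)

# To save over the original file before rezipping (not run here):
# write_xml(doc, "sharedStrings.xml")
```

Because gsub with fixed = TRUE replaces substrings, a key that is a prefix of another key would also match inside the longer one. If that can happen in your data, matching whole strings may be safer.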
[…] .xml file as a text file, using readLines() |> stringr::str_replace_all(..) |> writeLines() or similar. – r2evans, commented Mar 11 at 14:36
[…] ns <- xml_ns_rename(xml_ns(doc), d1 = "ns"). So this is your solution:
library(xml2)
xml_doc <- read_xml("<sst xmlns='http://schemas.openxmlformats.org/spreadsheetml/2006/main' count='12' uniqueCount='6'><si><t>DDR022-02-S2</t></si><si><t>D2232-15-S1</t></si><si><t>MP223-21-S2</t></si></sst>")
text_nodes <- xml_find_all(xml_doc, "//d1:t", xml_ns(xml_doc))
xml_text(text_nodes) <- gsub("DDR022-02-S2", "replacement", xml_text(text_nodes), fixed = TRUE)
cat(as.character(xml_doc))
– Tim G, commented Mar 11 at 18:01
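The plain-text route from r2evans's comment could look roughly like the sketch below. It uses base R's gsub in place of stringr::str_replace_all so that nothing needs installing, and the temporary file and its contents are stand-ins for a real unzipped sharedStrings.xml.

```r
# Stand-in for an unzipped sharedStrings.xml (hypothetical content).
path <- tempfile(fileext = ".xml")
writeLines(
  paste0(
    "<sst xmlns='http://schemas.openxmlformats.org/spreadsheetml/2006/main'>",
    "<si><t>DDR022-02-S2</t></si><si><t>MP223-21-S2</t></si></sst>"
  ),
  path
)

# Treat the XML as ordinary text: read, replace, write back.
xml_lines <- readLines(path, warn = FALSE)
xml_lines <- gsub("DDR022-02-S2", "replacement", xml_lines, fixed = TRUE)
writeLines(xml_lines, path)

updated <- readLines(path, warn = FALSE)
print(updated)
```

This skips XML parsing entirely, so the namespace never comes up. The trade-off is that replacement values containing &, < or > would have to be escaped by hand, which the xml2 approach handles for you.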