I would like to check whether the text in a variable contains a geographical reference. I have created a dictionary with all the municipalities I’m interested in. My goal is a dummy variable that captures whether the text includes any entry from the dictionary. Can you help me with that? I know it is probably very easy, but I’m struggling to do it.
This is my basic code:
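    library(quanteda)
    library(readxl)  # read_excel() comes from readxl

    # Municipality names and the texts to search
    y <- read_excel("Municipalities.xls")
    x <- read.csv("PQs.csv", stringsAsFactors = FALSE)

    # Build one corpus from each source
    corpusPQs <- corpus(x, text_field = "description")
    corpusMun <- corpus(y, text_field = "name")

    # Attempt to turn the municipality names into a dictionary
    corpusMun_l <- as.list(corpusMun)
    dictGeo <- dictionary(corpusMun_l)

    # Attempt to count dictionary matches in each text
    geoDfm <- dfm(corpusPQs, dictionary = dictGeo)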
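For reference, here is a minimal sketch of the kind of result I am after. It rests on a few assumptions: the two small data frames are made-up stand-ins for PQs.csv and Municipalities.xls, the key name `municipality` and the column name `geo_dummy` are invented for illustration, and the route through `tokens_lookup()` with a single-key dictionary is only one possible approach, not necessarily the recommended one.

```r
library(quanteda)

# Hypothetical stand-ins for PQs.csv and Municipalities.xls
x <- data.frame(
  description = c("Funding for schools in Cork and Galway",
                  "Question about national tax policy",
                  "Road works near New Ross"),
  stringsAsFactors = FALSE
)
y <- data.frame(name = c("Cork", "Galway", "New Ross"),
                stringsAsFactors = FALSE)

# dictionary() expects a named list: one key ("municipality" is an
# illustrative name) holding every municipality name as a pattern
dictGeo <- dictionary(list(municipality = y$name))

# Tokenise the texts, then replace matching tokens by the dictionary key
toksPQs <- tokens(corpus(x, text_field = "description"))
geoDfm <- dfm(tokens_lookup(toksPQs, dictGeo))

# Dummy: 1 if a text mentions at least one municipality, 0 otherwise
x$geo_dummy <- as.integer(unname(ntoken(geoDfm)) > 0)
```

With a single dictionary key, the number of tokens left after the lookup equals the number of municipality matches in each text, so any count above zero marks the text as containing a geographical reference.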
Thank you very much