Melting an R data.table with a factor column

By : Worood
Date : October 17 2020, 01:08 AM
Staying within data.table, after your suggested approach of using melt, you can use tstrsplit to split the variable column on the "_" character and then dcast the result back to wide format.
code :
## use tstrsplit to split a column on a regular expression
dt[, c("xy", "type") := tstrsplit(variable, "_")]
#       ID variable value xy type
#  1: 05AC      x_A  0.81  x    A
#  2: 01BA      x_A  0.41  x    A
#  3: Z1AC      x_A  0.41  x    A
#  4: B2BA      x_A  0.21  x    A
#  5: 05AC      y_A  3.00  y    A
#  6: 01BA      y_A  5.00  y    A
#  7: Z1AC      y_A  5.00  y    A
#  8: B2BA      y_A  6.50  y    A
#  9: 05AC      x_B  0.92  x    B
# 10: 01BA      x_B  0.63  x    B
# 11: Z1AC      x_B  0.58  x    B
# 12: B2BA      x_B  1.00  x    B
# 13: 05AC      y_B  2.05  y    B
# 14: 01BA      y_B  1.80  y    B
# 15: Z1AC      y_B  1.80  y    B
# 16: B2BA      y_B  1.80  y    B
dcast(dt, formula = ID + type ~ xy)

#      ID type    x    y
# 1: 01BA    A 0.41 5.00
# 2: 01BA    B 0.63 1.80
# 3: 05AC    A 0.81 3.00
# 4: 05AC    B 0.92 2.05
# 5: B2BA    A 0.21 6.50
# 6: B2BA    B 1.00 1.80
# 7: Z1AC    A 0.41 5.00
# 8: Z1AC    B 0.58 1.80
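The steps above can be sketched end to end as follows. The wide table named `wide` is an assumption, rebuilt from the values in the printed output; `variable.factor = FALSE`, `fixed = TRUE` and `value.var = "value"` are optional arguments added here for explicitness and are not part of the original answer.

```r
library(data.table)

# Hypothetical wide input, reconstructed from the printed output above
wide <- data.table(
  ID  = c("05AC", "01BA", "Z1AC", "B2BA"),
  x_A = c(0.81, 0.41, 0.41, 0.21),
  y_A = c(3, 5, 5, 6.5),
  x_B = c(0.92, 0.63, 0.58, 1),
  y_B = c(2.05, 1.8, 1.8, 1.8)
)

# Wide -> long; keep 'variable' as character instead of factor
long <- melt(wide, id.vars = "ID", variable.factor = FALSE)

# Split "x_A" into xy = "x" and type = "A" (literal "_", not a regex)
long[, c("xy", "type") := tstrsplit(variable, "_", fixed = TRUE)]

# Long -> wide again, one row per ID/type pair
res <- dcast(long, ID + type ~ xy, value.var = "value")
print(res)
```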


What is the most efficient way to split a combined factor column into two factor columns in an r data.table?

By : Jeff MacKinder
Date : March 29 2020, 07:55 AM
I have a large data.table (9 M rows) with two columns: fcombined and value. fcombined is a factor, but it is actually the result of interacting two factors. What is the most efficient way to split the one factor column into two again? I have already come up with a solution that works OK, but maybe there is a more straightforward way that I have missed.
I think this does the trick and is about 5x faster:
code :
setkey(DT, fcombined)
DT[DT[, data.table(fcombined = levels(fcombined),
                   do.call(rbind, strsplit(levels(fcombined), "_")))]]
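A related variant, shown here only as a sketch, splits the levels once and then indexes the pieces by the factor's integer codes rather than joining. The sample data and the output column names `f1` and `f2` are assumptions made for illustration.

```r
library(data.table)

# Hypothetical sample data: fcombined is the interaction of two factors
set.seed(1)
a <- factor(sample(c("a", "b", "c"), 20, replace = TRUE))
b <- factor(sample(c("u", "v"), 20, replace = TRUE))
DT <- data.table(
  fcombined = interaction(a, b, sep = "_", drop = TRUE),
  value     = rnorm(20)
)

# Split each distinct level once (cheap: few levels, many rows)
lev   <- levels(DT$fcombined)
parts <- tstrsplit(lev, "_", fixed = TRUE)

# Map the split levels back to the rows through the integer codes
idx <- as.integer(DT$fcombined)
DT[, f1 := factor(parts[[1]][idx])]
DT[, f2 := factor(parts[[2]][idx])]
print(head(DT))
```

The string work scales with the number of levels, not the number of rows, which is the same idea the keyed join above relies on.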

Converting character to factor using lapply with no melting

By : Gabriel V. Cardoso
Date : March 29 2020, 07:55 AM
I have a list of character matrices and would like to convert two of the columns (lat, lon) to factor. I've tried using lapply for this and it works, but it also reshapes my data frames. I've tried using as.factor two ways: one on just the two desired columns (not good, returns all other columns as NA) and one on the entire data frame, but reshaping occurs in both instances. I then tried to melt my list of matrices back to the original, desired shape, but thought that it might be better not to create the original problem rather than trying to fix it after the fact. Any ideas on how to convert to factor without the reshaping occurring?
You can transform the list of two matrices with
code :
lapply(mytest, as.data.frame)
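Whether `as.data.frame` turns the character columns into factors depends on the `stringsAsFactors` default, which changed to `FALSE` in R 4.0.0. A minimal sketch that converts only lat and lon, independent of that default, is shown below; the contents of `mytest` and the `site` column are made up for illustration.

```r
# Hypothetical input: a list of two character matrices
mytest <- list(
  matrix(c("s1", "s2", "10.5", "11.0", "-3.2", "-3.9"), ncol = 3,
         dimnames = list(NULL, c("site", "lat", "lon"))),
  matrix(c("s3", "s4", "12.1", "12.1", "-4.0", "-4.4"), ncol = 3,
         dimnames = list(NULL, c("site", "lat", "lon")))
)

out <- lapply(mytest, function(m) {
  # Keep the matrix shape: one data.frame column per matrix column
  df <- as.data.frame(m, stringsAsFactors = FALSE)
  # Convert only the two target columns to factor
  df[c("lat", "lon")] <- lapply(df[c("lat", "lon")], factor)
  df
})
str(out[[1]])
```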

data.table::melt - variable column converted to factor with variable.factor = FALSE specified

By : INdaba Ndaba
Date : March 29 2020, 07:55 AM
The reason this is happening is that df is a data.frame. In such cases, melt from data.table falls back to the behavior of melt from reshape2, which doesn't have a variable.factor argument.
You can see this in the source code of data.table::melt:
code :
> data.table::melt
function (data, ..., na.rm = FALSE, value.name = "value") 
    if (is.data.table(data)) 
        UseMethod("melt", data)
    else reshape2::melt(data, ..., na.rm = na.rm, value.name = value.name)
<bytecode: 0x10f886b88>
<environment: namespace:data.table>
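Melting the data.frame df therefore ignores variable.factor = FALSE, and time comes back as a factor:
melt(df,
  id.vars = c("bloc", "name"),
  variable.name = "time",
  value.name = "severity",
  variable.factor = FALSE
) %>% str()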
'data.frame': 132 obs. of  4 variables:
 $ bloc    : int  1 1 2 2 3 3 4 4 5 5 ...
 $ name    : chr  "Cristina" "Robijn" "Robijn" "Cristina" ...
 $ time    : Factor w/ 11 levels "d1","d7","d10",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ severity: num  0 0 0 0 0 0 0 0 0 0 ...
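Converting df to a data.table first dispatches to the data.table method, which honors variable.factor = FALSE, so time stays character:
melt(as.data.table(df),
  id.vars = c("bloc", "name"),
  variable.name = "time",
  value.name = "severity",
  variable.factor = FALSE
) %>% str()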
Classes ‘data.table’ and 'data.frame':    132 obs. of  4 variables:
 $ bloc    : int  1 1 2 2 3 3 4 4 5 5 ...
 $ name    : chr  "Cristina" "Robijn" "Robijn" "Cristina" ...
 $ time    : chr  "d1" "d1" "d1" "d1" ...
 $ severity: num  0 0 0 0 0 0 0 0 0 0 ...
 - attr(*, ".internal.selfref")=<externalptr>
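A small self-contained sketch of the data.table route follows. The data frame is invented for illustration, reusing the column names from the output above; only the `as.data.table` path is shown, since the handling of plain data.frames may differ between data.table versions.

```r
library(data.table)

# Hypothetical data.frame with the same column names as above
df <- data.frame(
  bloc = c(1L, 1L, 2L),
  name = c("Cristina", "Robijn", "Robijn"),
  d1   = c(0, 0, 0),
  d7   = c(1, 0, 2)
)

# Convert first so that the data.table method of melt is used
long <- melt(
  as.data.table(df),
  id.vars         = c("bloc", "name"),
  variable.name   = "time",
  value.name      = "severity",
  variable.factor = FALSE
)
str(long)
```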

Replacing NAs with a new factor level in one column based on factor level in another column using data.table

By : Francesco Queirolo
Date : March 29 2020, 07:55 AM
We can specify the logical condition in i and assign 'yet_another_stuff' to those values of 'col_2' that match the condition.
code :
DATA[is.na(col_2) & col_1 == "C", col_2 := "yet_another_stuff"]
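As a hedged illustration with made-up data, the sketch below declares the new level up front, so the result does not depend on how the assignment treats a level the factor has not seen before. The contents of `DATA` are assumptions; only the column names and the replacement value come from the answer.

```r
library(data.table)

# Hypothetical data: col_2 is a factor with some NAs
DATA <- data.table(
  col_1 = factor(c("A", "C", "C", "B")),
  col_2 = factor(c("x", NA, "y", NA))
)

# Make the new level explicit before assigning it
DATA[, col_2 := factor(col_2, levels = c(levels(col_2), "yet_another_stuff"))]

# Fill NAs in col_2 only where col_1 is "C"
DATA[is.na(col_2) & col_1 == "C", col_2 := "yet_another_stuff"]
print(DATA)
```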

Merge two large data.tables based on column name of one table and column value of the other without melting

By : user3437323
Date : March 29 2020, 07:55 AM
I've got two large data.tables, DT1 (2M rows x 300 cols) and DT2 (50M rows x 2 cols), and I would like to merge the values of DT1's columns into a new column in DT2, based on the column name given in a DT2 column. I'd like to achieve this without having to melt DT1, and by using data.table operations only, if possible.
Using set():
code :
setkey(DT1, "ID")
setkey(DT2, "ID")
for (k in names(DT1)[-1]) {
  rows <- which(DT2[["col"]] == k)
  set(DT2, i = rows, j = "col_value", DT1[DT2[rows], ..k])
}

   ID  col col_value
1:  A col1         1
2:  A col4        13
3:  B col2         6
4:  B col3        10
5:  C col1         3
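A runnable variant of the same loop is sketched below. It swaps the keyed join for a plain `match()` lookup, which is not what the answer uses, and the small `DT1` and `DT2` are assumptions chosen to be consistent with the printed result.

```r
library(data.table)

# Hypothetical inputs, consistent with the output shown above
DT1 <- data.table(ID = c("A", "B", "C", "D"),
                  col1 = 1:4, col2 = 5:8, col3 = 9:12, col4 = 13:16)
DT2 <- data.table(ID  = c("A", "A", "B", "B", "C"),
                  col = c("col1", "col4", "col2", "col3", "col1"))

# Pre-allocate the target column so set() can fill it by reference
DT2[, col_value := NA_integer_]

for (k in setdiff(names(DT1), "ID")) {
  rows <- which(DT2[["col"]] == k)          # DT2 rows that point at column k
  if (length(rows) == 0L) next
  # Look up each ID's position in DT1 and pull the value from column k
  vals <- DT1[[k]][match(DT2[["ID"]][rows], DT1[["ID"]])]
  set(DT2, i = rows, j = "col_value", value = vals)
}
print(DT2)
```

Looping over the column names of DT1 keeps DT1 in wide form, so no melted copy is created.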