#Data_management

From ''for()'' loops to the ''split-apply-combine'' paradigm for column-wise tasks: the transition for a dinosaur

Published at December 11, 2020 ·  9 min read

I have been involved with data crunching for 30 years, and, due to my age, I see myself as a dinosaur within the R-users community. I must admit, I’m rather slow to incorporate new paradigms in my programming workflow … I’m pretty busy and the time I save today is often more important than the time I could save in the future, by picking up new techniques. However, resisting to progress is not necessarily a good idea and, from time to time, also a dinosaur feels like living more dangerously and exploring new ideas and views.

...


Some everyday data tasks: a few hints with R (revisited)

Published at January 28, 2020 ·  12 min read

One year ago, I published a post titled ‘Some everyday data tasks: a few hints with R’. In that post, I considered four data tasks, that we all need to accomplish daily, i.e.

  1. subsetting
  2. sorting
  3. casting
  4. melting

In that post, I used the methods I was more familiar with. And, as a long-time R user, I have mainly incorporated in my workflow all the functions from the base R implementation.

...


Some everyday data tasks: a few hints with R

Published at March 27, 2019 ·  9 min read

We all work with data frames and it is important that we know how we can reshape them, as necessary to meet our needs. I think that there are, at least, four routine tasks that we need to be able to accomplish:

  1. subsetting
  2. sorting
  3. casting
  4. melting

Obviously, there is a wide array of possibilities; I’ll just mention a few, which I regularly use.

Subsetting the data

Subsetting means selecting the records (rows) or the variables (columns) which satisfy certain criteria. Let’s take the ‘students.csv’ dataset, which is available on one of my repositories. It is a database of student’s marks in a series of exams for different subjects.

...