How to combine multiple text files of different lengths and multiple columns by a column












3















I have 60 text files of different lengths and same column names.



For example:



cat Sample_145_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c
19258 circRNA
612 ciRNA

cat Sample_146_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c
17791 circRNA
729 ciRNA

cat Sample_147_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c
22838 circRNA
686 ciRNA

cat Sample_148_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c
19404 circRNA
475 ciRNA


I want to produce a 'master' table of all identified circRNAs, with readnumber as column for each sample and flankintronas rownames:



screen shot of the file










share|improve this question









New contributor




grt is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.

























    3















    I have 60 text files of different lengths and same column names.



    For example:



    cat Sample_145_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c
    19258 circRNA
    612 ciRNA

    cat Sample_146_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c
    17791 circRNA
    729 ciRNA

    cat Sample_147_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c
    22838 circRNA
    686 ciRNA

    cat Sample_148_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c
    19404 circRNA
    475 ciRNA


    I want to produce a 'master' table of all identified circRNAs, with readnumber as column for each sample and flankintronas rownames:



    screen shot of the file










    share|improve this question









    New contributor




    grt is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
    Check out our Code of Conduct.























      3












      3








      3








      I have 60 text files of different lengths and same column names.



      For example:



      cat Sample_145_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c
      19258 circRNA
      612 ciRNA

      cat Sample_146_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c
      17791 circRNA
      729 ciRNA

      cat Sample_147_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c
      22838 circRNA
      686 ciRNA

      cat Sample_148_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c
      19404 circRNA
      475 ciRNA


      I want to produce a 'master' table of all identified circRNAs, with readnumber as column for each sample and flankintronas rownames:



      screen shot of the file










      share|improve this question









      New contributor




      grt is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.












      I have 60 text files of different lengths and same column names.



      For example:



      cat Sample_145_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c
      19258 circRNA
      612 ciRNA

      cat Sample_146_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c
      17791 circRNA
      729 ciRNA

      cat Sample_147_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c
      22838 circRNA
      686 ciRNA

      cat Sample_148_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c
      19404 circRNA
      475 ciRNA


      I want to produce a 'master' table of all identified circRNAs, with readnumber as column for each sample and flankintronas rownames:



      screen shot of the file







      command-line






      share|improve this question









      New contributor




      grt is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.











      share|improve this question









      New contributor




      grt is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.









      share|improve this question




      share|improve this question








      edited 3 hours ago









      dessert

      25.1k673106




      25.1k673106






      New contributor




      grt is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.









      asked 5 hours ago









      grtgrt

      16




      16




      New contributor




      grt is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.





      New contributor





      grt is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.






      grt is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.






















          1 Answer
          1






          active

          oldest

          votes


















          3
















          If all of the columns in all of the files are in the same order, then just concat them together with >>:



          for x in {1..60}; do 
          # These flags for tail just cut of the top line, which is your headers
          tail -n 2 Sample_$x_blah.txt >> Sample_master.txt
          # and the double carat makes the output append^
          done


          If not, then you can write the translations in awk sort of like you had above, i.e.



          $ cat Sample_1.txt 
          col1,col2,col3,col4 #etc
          $ cat Sample_2.txt
          col4,col3,col2,col1
          $ cat Sample_1.txt > Sample_Master.txt # no translation needed
          $ awk '{print $4","$3","$2","$1 }' Sample_2.txt >> Sample_Master.txt


          But with 60 files, that would be more work than- something like writing a python script using python's csv lib...






          share|improve this answer

























            Your Answer








            StackExchange.ready(function() {
            var channelOptions = {
            tags: "".split(" "),
            id: "89"
            };
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function() {
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled) {
            StackExchange.using("snippets", function() {
            createEditor();
            });
            }
            else {
            createEditor();
            }
            });

            function createEditor() {
            StackExchange.prepareEditor({
            heartbeatType: 'answer',
            autoActivateHeartbeat: false,
            convertImagesToLinks: true,
            noModals: true,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: 10,
            bindNavPrevention: true,
            postfix: "",
            imageUploader: {
            brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
            contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
            allowUrls: true
            },
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            });


            }
            });






            grt is a new contributor. Be nice, and check out our Code of Conduct.










            draft saved

            draft discarded


















            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2faskubuntu.com%2fquestions%2f1128946%2fhow-to-combine-multiple-text-files-of-different-lengths-and-multiple-columns-by%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown

























            1 Answer
            1






            active

            oldest

            votes








            1 Answer
            1






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes









            3
















            If all of the columns in all of the files are in the same order, then just concat them together with >>:



            for x in {1..60}; do 
            # These flags for tail just cut of the top line, which is your headers
            tail -n 2 Sample_$x_blah.txt >> Sample_master.txt
            # and the double carat makes the output append^
            done


            If not, then you can write the translations in awk sort of like you had above, i.e.



            $ cat Sample_1.txt 
            col1,col2,col3,col4 #etc
            $ cat Sample_2.txt
            col4,col3,col2,col1
            $ cat Sample_1.txt > Sample_Master.txt # no translation needed
            $ awk '{print $4","$3","$2","$1 }' Sample_2.txt >> Sample_Master.txt


            But with 60 files, that would be more work than- something like writing a python script using python's csv lib...






            share|improve this answer






























              3
















              If all of the columns in all of the files are in the same order, then just concat them together with >>:



              for x in {1..60}; do 
              # These flags for tail just cut of the top line, which is your headers
              tail -n 2 Sample_$x_blah.txt >> Sample_master.txt
              # and the double carat makes the output append^
              done


              If not, then you can write the translations in awk sort of like you had above, i.e.



              $ cat Sample_1.txt 
              col1,col2,col3,col4 #etc
              $ cat Sample_2.txt
              col4,col3,col2,col1
              $ cat Sample_1.txt > Sample_Master.txt # no translation needed
              $ awk '{print $4","$3","$2","$1 }' Sample_2.txt >> Sample_Master.txt


              But with 60 files, that would be more work than- something like writing a python script using python's csv lib...






              share|improve this answer




























                3












                3








                3









                If all of the columns in all of the files are in the same order, then just concat them together with >>:



                for x in {1..60}; do 
                # These flags for tail just cut of the top line, which is your headers
                tail -n 2 Sample_$x_blah.txt >> Sample_master.txt
                # and the double carat makes the output append^
                done


                If not, then you can write the translations in awk sort of like you had above, i.e.



                $ cat Sample_1.txt 
                col1,col2,col3,col4 #etc
                $ cat Sample_2.txt
                col4,col3,col2,col1
                $ cat Sample_1.txt > Sample_Master.txt # no translation needed
                $ awk '{print $4","$3","$2","$1 }' Sample_2.txt >> Sample_Master.txt


                But with 60 files, that would be more work than- something like writing a python script using python's csv lib...






                share|improve this answer

















                If all of the columns in all of the files are in the same order, then just concat them together with >>:



                for x in {1..60}; do 
                # These flags for tail just cut of the top line, which is your headers
                tail -n 2 Sample_$x_blah.txt >> Sample_master.txt
                # and the double carat makes the output append^
                done


                If not, then you can write the translations in awk sort of like you had above, i.e.



                $ cat Sample_1.txt 
                col1,col2,col3,col4 #etc
                $ cat Sample_2.txt
                col4,col3,col2,col1
                $ cat Sample_1.txt > Sample_Master.txt # no translation needed
                $ awk '{print $4","$3","$2","$1 }' Sample_2.txt >> Sample_Master.txt


                But with 60 files, that would be more work than- something like writing a python script using python's csv lib...







                share|improve this answer














                share|improve this answer



                share|improve this answer








                edited 3 hours ago









                dessert

                25.1k673106




                25.1k673106










                answered 4 hours ago









                rm-vandarm-vanda

                2,29821323




                2,29821323






















                    grt is a new contributor. Be nice, and check out our Code of Conduct.










                    draft saved

                    draft discarded


















                    grt is a new contributor. Be nice, and check out our Code of Conduct.













                    grt is a new contributor. Be nice, and check out our Code of Conduct.












                    grt is a new contributor. Be nice, and check out our Code of Conduct.
















                    Thanks for contributing an answer to Ask Ubuntu!


                    • Please be sure to answer the question. Provide details and share your research!

                    But avoid



                    • Asking for help, clarification, or responding to other answers.

                    • Making statements based on opinion; back them up with references or personal experience.


                    To learn more, see our tips on writing great answers.




                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function () {
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2faskubuntu.com%2fquestions%2f1128946%2fhow-to-combine-multiple-text-files-of-different-lengths-and-multiple-columns-by%23new-answer', 'question_page');
                    }
                    );

                    Post as a guest















                    Required, but never shown





















































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown

































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown







                    Popular posts from this blog

                    Liste der Baudenkmale in Friedland (Mecklenburg)

                    Single-Malt-Whisky

                    Czorneboh