
      September 2019

      Understanding init in Go


      Introduction

      In Go, the predefined init() function marks a piece of code that runs before any other function in your package, immediately after the package-level variables have been initialized. This code executes once when the package is initialized, which for an imported package happens before the importing package's own code runs. It can be used when you need your application to start in a specific state, such as with a particular configuration or set of resources. It is also used when importing for a side effect, a technique that sets the state of a program by importing a specific package. This is often used to register one package with another so that the program uses the correct code for the task.

      Although init() is a useful tool, it can sometimes make code difficult to read, since a hard-to-find init() instance can greatly affect the order in which the code is run. Because of this, it is important for developers who are new to Go to understand the facets of this function, so that they can use init() in a legible manner when writing code.

      In this tutorial, you’ll learn how init() is used for the setup and initialization of specific package variables, one-time computations, and the registration of a package for use with another package.

      Prerequisites

      For some of the examples in this article, you will need a Go workspace with the following directory structure:

      .
      ├── bin 
      │ 
      └── src
          └── github.com
              └── gopherguides
      

      Declaring init()

      Any time you declare an init() function, Go will run it before any other function in that package, right after the package-level variables have been initialized. To demonstrate this, this section will walk through how to define an init() function and show the effects on how the package runs.

      Let’s first take the following as an example of code without the init() function:

      main.go

      package main
      
      import "fmt"
      
      var weekday string
      
      func main() {
          fmt.Printf("Today is %s", weekday)
      }
      

      In this program, we declared a global variable called weekday. By default, the value of weekday is an empty string.

      Let’s run this code:

      • go run main.go

      Because the value of weekday is blank, when we run the program, we will get the following output:

      Output

      Today is

      We can fill in the blank variable by introducing an init() function that initializes the value of weekday to the current day. Add in the following highlighted lines to main.go:

      main.go

      package main
      
      import (
          "fmt"
          "time"
      )
      
      var weekday string
      
      func init() {
          weekday = time.Now().Weekday().String()
      }
      
      func main() {
          fmt.Printf("Today is %s", weekday)
      }
      

      In this code, we imported and used the time package to get the current day of the week (Now().Weekday().String()), then used init() to initialize weekday with that value.

      Now when we run the program, it will print out the current weekday:

      Output

      Today is Monday

      While this illustrates how init() works, a much more typical use case for init() is to use it when importing a package. This can be useful when you need to do specific setup tasks in a package before you want the package to be used. To demonstrate this, let’s create a program that will require a specific initialization for the package to work as intended.
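
      Before moving on, the ordering described in this section can be sketched in a single file. This is a minimal illustration, not part of the original program: the trace slice and the initWeekday helper are hypothetical names, and the weekday value is a fixed placeholder so that the output is predictable.

```go
package main

import "fmt"

// trace records the order in which each initialization step runs.
// (Hypothetical helper, used only for this illustration.)
var trace []string

// weekday is a package-level variable. Its initializer runs before any init().
var weekday = initWeekday()

// initWeekday is a hypothetical helper that returns a fixed placeholder value.
func initWeekday() string {
	trace = append(trace, "variable initializer")
	return "Monday"
}

func init() {
	trace = append(trace, "init")
}

func main() {
	trace = append(trace, "main")
	fmt.Println("Today is", weekday)
	fmt.Println(trace) // prints: [variable initializer init main]
}
```

      The trace shows that package-level variables are initialized first, then init() runs, and only then does main() start.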

      Initializing Packages on Import

      First, we will write some code that selects a random creature from a slice and prints it out. However, we won’t use init() in our initial program. This will better show the problem we have, and how init() will solve our problem.

      From within your src/github.com/gopherguides/ directory, create a folder called creature with the following command:

      • mkdir creature

      Inside the creature folder, create a file called creature.go:

      • nano creature/creature.go

      In this file, add the following contents:

      creature.go

      package creature
      
      import (
          "math/rand"
      )
      
      var creatures = []string{"shark", "jellyfish", "squid", "octopus", "dolphin"}
      
      func Random() string {
          i := rand.Intn(len(creatures))
          return creatures[i]
      }
      

      This file defines a variable called creatures that has a set of sea creatures initialized as values. It also has an exported Random function that will return a random value from the creatures variable.

      Save and quit this file.

      Next, let’s create a cmd package that we will use to write our main() function and call the creature package.

      At the same file level from which we created the creature folder, create a cmd folder with the following command:

      • mkdir cmd

      Inside the cmd folder, create a file called main.go:

      • nano cmd/main.go

      Add the following contents to the file:

      cmd/main.go

      package main
      
      import (
          "fmt"
      
          "github.com/gopherguides/creature"
      )
      
      func main() {
          fmt.Println(creature.Random())
          fmt.Println(creature.Random())
          fmt.Println(creature.Random())
          fmt.Println(creature.Random())
      }
      

      Here we imported the creature package, and then in the main() function, used the creature.Random() function to retrieve a random creature and print it out four times.

      Save and quit main.go.

      We now have our entire program written. However, before we can run this program, we’ll need to also create a couple of configuration files for our code to work properly. Go uses Go Modules to configure package dependencies for importing resources. A module is described by a go.mod configuration file placed in its root directory, which tells the go command where to import packages from. While learning about modules is beyond the scope of this article, we can write just a couple lines of configuration to make this example work locally.

      In the cmd directory, create a file named go.mod:

      • nano cmd/go.mod

      Once the file is open, place in the following contents:

      cmd/go.mod

      module github.com/gopherguides/cmd
      replace github.com/gopherguides/creature => ../creature
      

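      The first line of this file tells the compiler that the cmd package we created is in fact github.com/gopherguides/cmd. The second line tells the compiler that github.com/gopherguides/creature can be found locally on disk in the ../creature directory. On Go 1.16 and later, the go command no longer adds missing requirements to go.mod on its own, so you may also need to run go mod tidy in the cmd directory to record a require line for github.com/gopherguides/creature before the program will build.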

      Save and close the file. Next, create a go.mod file in the creature directory:

      • nano creature/go.mod

      Add the following line of code to the file:

      creature/go.mod

      module github.com/gopherguides/creature
      

      This tells the compiler that the creature package we created is actually the github.com/gopherguides/creature package. Without this, the cmd package would not know where to import this package from.

      Save and quit the file.

      You should now have the following directory structure and file layout:

      ├── cmd
      │   ├── go.mod
      │   └── main.go
      └── creature
          ├── go.mod
          └── creature.go
      

      Now that we have all the configuration completed, we can run the main program with the following command:

      This will give:

      Output

      jellyfish
      squid
      squid
      dolphin

      When we ran this program, we received four values and printed them out. If we run the program multiple times, we will notice that we always get the same output, rather than a random result as expected. This is because the rand package creates pseudorandom numbers that will consistently generate the same output for a single initial state. (Go 1.20 and later seed the package-level generator automatically, so this behavior applies to earlier releases.) To achieve a more random number, we can seed the package, or set a changing source so that the initial state is different every time we run the program. In Go, it is common to use the current time to seed the rand package.
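
      The link between the initial state and the output can be shown with explicit generators. The following is a minimal sketch rather than part of the creature program: the sequence helper and the seed values are assumptions chosen for illustration, and it uses rand.New with rand.NewSource so that each call starts from a known seed.

```go
package main

import (
	"fmt"
	"math/rand"
)

// sequence returns n pseudorandom indexes in [0, max), drawn from a fresh
// generator created with the given seed. (Hypothetical helper.)
func sequence(seed int64, n, max int) []int {
	r := rand.New(rand.NewSource(seed))
	out := make([]int, n)
	for i := range out {
		out[i] = r.Intn(max)
	}
	return out
}

func main() {
	a := sequence(1, 4, 5)
	b := sequence(1, 4, 5)
	c := sequence(2, 4, 5)

	fmt.Println(a)
	fmt.Println(b) // identical to a, because the seed is the same
	fmt.Println(c) // drawn from a generator with a different seed
}
```

      Two generators that start from the same seed produce the same sequence, which is why an unseeded program on older Go releases prints the same creatures on every run.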

      Since we want the creature package to handle the random functionality, open up this file:

      • nano creature/creature.go

      Add the following highlighted lines to the creature.go file:

      creature/creature.go

      package creature
      
      import (
          "math/rand"
          "time"
      )
      
      var creatures = []string{"shark", "jellyfish", "squid", "octopus", "dolphin"}
      
      func Random() string {
          rand.Seed(time.Now().UnixNano())
          i := rand.Intn(len(creatures))
          return creatures[i]
      }
      

      In this code, we imported the time package and used Seed() to seed the generator with the current time. Save and exit the file.

      Now, when we run the program we will get a random result:

      Output

      jellyfish
      octopus
      shark
      jellyfish

      If you continue to run the program over and over, you will continue to get random results. However, this is not yet an ideal implementation of our code, because every time creature.Random() is called, it also re-seeds the rand package by calling rand.Seed(time.Now().UnixNano()) again. Re-seeding on every call does unnecessary work, and if two calls happen before the clock value changes, both will seed the generator with the same initial value and return the same creature.

      To fix this, we can use an init() function. Let’s update the creature.go file:

      • nano creature/creature.go

      Add the following lines of code:

      creature/creature.go

      package creature
      
      import (
          "math/rand"
          "time"
      )
      
      var creatures = []string{"shark", "jellyfish", "squid", "octopus", "dolphin"}
      
      func init() {
          rand.Seed(time.Now().UnixNano())
      }
      
      func Random() string {
          i := rand.Intn(len(creatures))
          return creatures[i]
      }
      

      Adding the init() function tells the compiler that when the creature package is imported, it should run the init() function once, providing a single seed for the random number generation. This ensures that we don’t run code more than we have to. Now if we run the program, we will continue to get random results:

      Output

      dolphin
      squid
      dolphin
      octopus

      In this section, we have seen how using init() can ensure that the appropriate calculations or initializations are performed prior to the package being used. Next, we will see how to use multiple init() statements in a package.
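
      As a variation on the same idea, the sketch below creates a package-private generator once in init() instead of calling rand.Seed, which is deprecated as of Go 1.20. This is an alternative offered for illustration, not the approach used in the article: the rng variable is a hypothetical name, and the code is written as a single main package so it can be run on its own.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

var creatures = []string{"shark", "jellyfish", "squid", "octopus", "dolphin"}

// rng is a package-private generator (hypothetical name). It is created once,
// in init(), rather than on every call to Random.
var rng *rand.Rand

func init() {
	rng = rand.New(rand.NewSource(time.Now().UnixNano()))
}

// Random returns a pseudorandomly chosen creature.
func Random() string {
	return creatures[rng.Intn(len(creatures))]
}

func main() {
	for i := 0; i < 4; i++ {
		fmt.Println(Random())
	}
}
```

      One trade-off to keep in mind is that a *rand.Rand created this way is not safe for concurrent use, so a program that calls Random from several goroutines would need to add its own locking.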

      Multiple Instances of init()

      Unlike the main() function that can only be declared once, the init() function can be declared multiple times throughout a package. However, multiple init()s can make it difficult to know which one has priority over the others. In this section, we will show how to maintain control over multiple init() statements.

      In most cases, init() functions will execute in the order that you encounter them. Let’s take the following code as an example:

      main.go

      package main
      
      import "fmt"
      
      func init() {
          fmt.Println("First init")
      }
      
      func init() {
          fmt.Println("Second init")
      }
      
      func init() {
          fmt.Println("Third init")
      }
      
      func init() {
          fmt.Println("Fourth init")
      }
      
      func main() {}
      

      If we run the program with the following command:

      • go run main.go

      We’ll receive the following output:

      Output

      First init
      Second init
      Third init
      Fourth init

      Notice that each init() is run in the order in which the compiler encounters it. However, it may not always be so easy to determine the order in which the init() function will be called.

      Let’s look at a more complicated package structure in which we have multiple files each with their own init() function declared in them. To illustrate this, we will create a program that shares a variable called message and prints it out.

      Delete the creature and cmd directories and their contents from the earlier section, and replace them with the following directories and file structure:

      ├── cmd
      │   ├── a.go
      │   ├── b.go
      │   └── main.go
      └── message
          └── message.go
      

      Now let’s add the contents of each file. In a.go, add the following lines:

      cmd/a.go

      package main
      
      import (
          "fmt"
      
          "github.com/gopherguides/message"
      )
      
      func init() {
          fmt.Println("a ->", message.Message)
      }
      

      This file contains a single init() function that prints out the value of message.Message from the message package.

      Next, add the following contents to b.go:

      cmd/b.go

      package main
      
      import (
          "fmt"
      
          "github.com/gopherguides/message"
      )
      
      func init() {
          message.Message = "Hello"
          fmt.Println("b ->", message.Message)
      }
      

      In b.go, we have a single init() function that sets the value of message.Message to Hello and prints it out.

      Next, create main.go to look like the following:

      cmd/main.go

      package main
      
      func main() {}
      

      This file does nothing, but provides an entry point for the program to run.

      Finally, create your message.go file like the following:

      message/message.go

      package message
      
      var Message string
      

      Our message package declares the exported Message variable.

      To run the program, execute the following command from the cmd directory:

      • go run *.go

      Because we have multiple Go files in the cmd folder that make up the main package, we need to tell the compiler that all the .go files in the cmd folder should be compiled. Using *.go tells the compiler to load all the files in the cmd folder that end in .go. If we issued the command go run main.go instead, only main.go would be compiled. The program would still build and run, but the init() functions in a.go and b.go would not be included, so nothing would be printed.

      This will give the following output:

      Output

      a ->
      b -> Hello

      The Package Initialization section of the Go language specification encourages build systems to present the files of a package to the compiler in lexical file name order, and the go tool processes them alphabetically. Because of this, the first time we printed out message.Message from a.go, the value was blank. The value wasn’t initialized until the init() function from b.go had been run.

      If we were to change the file name of a.go to c.go, we would get a different result:

      Output

      b -> Hello
      a -> Hello

      Now the compiler encounters b.go first, and as such, the value of message.Message is already initialized with Hello when the init() function in c.go is encountered.

      This behavior could create a possible problem in your code. It is common in software development to change file names, and because of how init() is processed, changing file names may change the order in which init() is processed. This could have the undesired effect of changing your program’s output. To ensure reproducible initialization behavior, build systems are encouraged to present multiple files belonging to the same package in lexical file name order to a compiler. One way to ensure that all init() functions are loaded in order is to declare them all in one file. This will prevent the order from changing even if file names are changed.

      In addition to ensuring the order of your init() functions does not change, you should also try to avoid managing state in your package by using global variables, i.e., variables that are accessible from anywhere in the package. In the preceding program, the message.Message variable was available to the entire package and maintained the state of the program. Because of this access, the init() statements were able to change the variable and destabilize the predictability of your program. To avoid this, try to work with variables in controlled spaces that have as little access as possible while still allowing the program to work.
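
      One way to keep state in a controlled space is to hold it in an unexported field and pass it in explicitly, rather than setting an exported variable from init(). The following is a minimal sketch of that idea. The Greeter type, the NewGreeter constructor, and the Message method are hypothetical names introduced for illustration and are not part of the message package above.

```go
package main

import "fmt"

// Greeter keeps its message in an unexported field instead of an exported
// package-level variable. (Hypothetical type, for illustration only.)
type Greeter struct {
	message string
}

// NewGreeter is a hypothetical constructor. The caller chooses the message
// explicitly, so no init() function needs to set it.
func NewGreeter(message string) *Greeter {
	return &Greeter{message: message}
}

// Message returns the stored message without letting callers modify it.
func (g *Greeter) Message() string {
	return g.message
}

func main() {
	g := NewGreeter("Hello")
	fmt.Println("main ->", g.Message()) // prints: main -> Hello
}
```

      Because the value is set where the Greeter is created, renaming or reordering source files cannot change what the program prints.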

      We have seen that you can have multiple init() declarations in a single package. However, doing so may create undesired effects and make your program hard to read or predict. Avoiding multiple init() statements or keeping them all in one file will ensure that the behavior of your program does not change when files are moved around or names are changed.

      Next, we will examine how init() is used to import with side effects.

      Using init() for Side Effects

      In Go, it is sometimes desirable to import a package not for its content, but for the side effects that occur upon importing the package. This often means that there is an init() statement in the imported code that executes before any of the other code, allowing for the developer to manipulate the state in which their program is starting. This technique is called importing for a side effect.

      A common use case for importing for side effects is to register functionality in your code, which lets a package know what part of the code your program needs to use. In the image package, for example, the image.Decode function needs to know which format of image it is trying to decode (jpg, png, gif, etc.) before it can execute. You can accomplish this by first importing a specific package that has an init() statement side effect.

      Let’s say you are trying to use image.Decode on a .png file with the following code snippet:

      Sample Decoding Snippet

      . . .
      func decode(reader io.Reader) image.Rectangle {
          m, _, err := image.Decode(reader)
          if err != nil {
              log.Fatal(err)
          }
          return m.Bounds()
      }
      . . .
      

      A program with this code will still compile, but any time we try to decode a png image, we will get an error.
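
      To see the failure concretely, here is a minimal sketch that calls image.Decode in a program where no image format package is imported. The tryDecode helper and the pngSignature variable are illustrative names, and the input is only the 8-byte PNG signature rather than a complete image.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"image"
)

// pngSignature holds the 8 bytes that begin every PNG file.
// (Illustrative input; it is not a complete image.)
var pngSignature = []byte("\x89PNG\r\n\x1a\n")

// tryDecode attempts to decode data using whatever formats are registered.
// (Hypothetical helper.)
func tryDecode(data []byte) error {
	_, _, err := image.Decode(bytes.NewReader(data))
	return err
}

func main() {
	err := tryDecode(pngSignature)
	// No format has been registered, so the image package cannot recognize the data.
	fmt.Println(errors.Is(err, image.ErrFormat)) // prints: true
	fmt.Println(err)
}
```

      The error is image.ErrFormat, which the image package returns when none of the registered formats matches the input.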

      To fix this, we would need to first register an image format for image.Decode. Luckily, the image/png package contains the following init() statement:

      image/png/reader.go

      func init() {
          image.RegisterFormat("png", pngHeader, Decode, DecodeConfig)
      }
      

      Therefore, if we import image/png into our decoding snippet, then the image.RegisterFormat() function in image/png will run before any of our code:

      Sample Decoding Snippet

      . . .
      import _ "image/png"
      . . .
      
      func decode(reader io.Reader) image.Rectangle {
          m, _, err := image.Decode(reader)
          if err != nil {
              log.Fatal(err)
          }
          return m.Bounds()
      }
      

      This will set the state and register that we require the png version of image.Decode(). This registration will happen as a side effect of importing image/png.

      You may have noticed the blank identifier (_) before "image/png". This is needed because Go does not allow a file to import a package that it never references. By including the blank identifier, the value of the import itself is discarded, so that only the side effect of the import comes through. This means that, even though we never call the image/png package in our code, we can still import it for the side effect.

      It is important to know when you need to import a package for its side effect. Without the proper registration, it is likely that your program will compile, but not work properly when it is run. The packages in the standard library will declare the need for this type of import in their documentation. If you write a package that requires importing for side effect, you should also make sure the init() statement you are using is documented so users that import your package will be able to use it properly.
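
      If you are writing your own package of this kind, the registration pattern can be sketched as follows. This is a simplified, single-file illustration under stated assumptions: the decoders map and the Register and Decode functions are hypothetical and are not the image package's actual implementation, and the init() function here stands in for one that would normally live in a separate package imported with the blank identifier.

```go
package main

import (
	"fmt"
	"strings"
)

// decoders is a hypothetical registry mapping a format name to its decode function.
var decoders = map[string]func(string) string{}

// Register adds a decoder to the registry. Format packages would call it
// from their own init() functions.
func Register(name string, decode func(string) string) {
	decoders[name] = decode
}

// This init() stands in for one in a separate format package that a program
// would import only for its side effect.
func init() {
	Register("upper", strings.ToUpper)
}

// Decode looks up a registered decoder by name and applies it.
func Decode(name, input string) (string, error) {
	decode, ok := decoders[name]
	if !ok {
		return "", fmt.Errorf("unknown format %q", name)
	}
	return decode(input), nil
}

func main() {
	out, err := Decode("upper", "shark")
	fmt.Println(out, err) // prints: SHARK <nil>

	_, err = Decode("lower", "SHARK")
	fmt.Println(err) // prints: unknown format "lower"
}
```

      The second call fails because nothing registered a "lower" format, which mirrors what happens when image.Decode is used without importing the matching format package.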

      Conclusion

      In this tutorial we learned that the init() function runs before the rest of the code in your package, and that it can perform specific tasks for a package like initializing a desired state. We also learned that the order in which multiple init() statements execute depends on the order in which the compiler loads the source files. If you would like to learn more about init(), check out the official Golang documentation, or read through the discussion in the Go community about the function.

      You can read more about functions with our How To Define and Call Functions in Go article, or explore the entire How To Code in Go series.




      How to List Open Files with lsof


      Updated by Linode

      Contributed by

      Mihalis Tsoukalos

      Introduction

      lsof was created by Victor A. Abell and is a utility that lists open files. As everything in Linux can be considered a file, this means that lsof can gather information on the majority of activity on your Linode, including network interfaces and network connections. lsof by default will output a list of all open files and the processes that opened them.

      The two main drawbacks of lsof are that it can only display information about the local machine (localhost), and that it requires administrative privileges to print all available data. Additionally, you usually do not execute lsof without any command line parameters because it outputs a large amount of data that can be difficult to parse. This happens because lsof will natively list all open files belonging to all active processes – for example, the output of wc(1) (a word count utility) when applied to lsof on a test instance shows the size of the output is extremely large:

      sudo lsof | wc
      
        
          7332   68337 1058393
      
      

      Before You Begin

      Note

      Running lsof without root privileges will only return
      the results available to the current user. If you are not familiar with the sudo command,
      see the Users and Groups guide.

      On most major distributions, lsof will come pre-installed and you can begin using it immediately. If for any reason it is not found, you can install lsof using your preferred package manager.

      Command Line Options

      The lsof(8) binary supports a large number of command line options, including the following:

      Option        Description
      -h and -?     Both options present a help screen. Please note that you will need to properly escape the ? character for -? to work.
      -a            This option tells lsof to logically AND all provided options.
      -b            This option tells lsof to avoid kernel functions that might block the returning of results. This is a very specialized option.
      -l            If converting a user ID to a login name is working improperly or slowly, you can disable it using the -l parameter.
      -P            The -P option prevents the conversion of port numbers to port names for network files.
      -u list       The -u option allows you to define a list of login names or user ID numbers whose files will be returned. The -u option supports the ^ character for excluding the matches from the output.
      -c list       The -c option selects the listing of files for processes executing the commands that begin with the characters in the list. This supports regular expressions, and also supports the ^ character for excluding the matches from the output.
      -p list       The -p option allows you to select the files for the processes whose process IDs are in the list. The -p option supports the ^ character for excluding the matches from the output.
      -g list       The -g option allows you to select the files for the processes whose optional process group IDs are in the list. The -g option supports the ^ character for excluding the matches from the output.
      -s            The -s option allows you to select the network protocols and states that interest you. The -s option supports the ^ character for excluding the matches from the output. The correct form is PROTOCOL:STATE. Possible protocols are UDP and TCP. Some possible TCP states are: CLOSED, SYN-SENT, SYN-RECEIVED, ESTABLISHED, CLOSE-WAIT, LAST-ACK, FIN-WAIT-1, FIN-WAIT-2, CLOSING, and TIME-WAIT. Possible UDP states are Unbound and Idle.
      +d s          The +d option tells lsof to search for all open instances of directory s and the files and directories it contains at its top level.
      +D directory  The +D option tells lsof to search for all open instances of directory directory and all the files and directories it contains to its complete depth.
      -d list       The -d option specifies the list of file descriptors to include or exclude from the output. -d 1,^2 means include file descriptor 1 and exclude file descriptor 2.
      -i4           This option is used for displaying IPv4 data only.
      -i6           This option is used for displaying IPv6 data only.
      -i            The -i option without any values tells lsof to display network connections only.
      -i ADDRESS    The -i option with a value will limit the displayed information to match that value. Some example values are TCP:25 for displaying TCP data that listens to port number 25, @google.com for displaying information related to google.com, :25 for displaying information related to port number 25, :POP3 for displaying information related to the port number that is associated to POP3 in the /etc/services file, etc. You can also combine hostnames and IP Addresses with port numbers and protocols.
      -t            The -t option tells lsof to display process identifiers without a header line. This is particularly useful for feeding the output of lsof to the kill(1) command or to a script. Notice that -t automatically selects the -w option.
      -w            The -w option disables the suppression of warning messages.
      +w            The +w option enables the suppression of warning messages.
      -r TIME       The -r option causes the lsof command to repeat every TIME seconds until the command is manually terminated with an interrupt.
      +r TIME       The +r option, with the + prefix, acts the same as the -r option, but will exit its loop when it fails to find any open files.
      -n            The -n option prevents network numbers from being converted to host names.
      -F CHARACTER  The -F option instructs lsof to produce output that is suitable as input for other programs. For a complete explanation, consult the lsof manual entry.

      Note

      By default, the output of lsof will include the output of each one of its command line options,
      like a big logical expression with multiple OR logical operators between all the command line
      options. However, this default behavior can change with the use of the -a option.

      Note

      For the full list of command line options supported by lsof and a more detailed
      explanation of the presented command line options, you should consult its manual page:

      man lsof
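
      Because the -t option prints only process identifiers, its output is simple for a script or program to consume. The following Go sketch shows one way that text could be parsed. The parsePIDs helper is a hypothetical name, and the sample string is made-up input in the shape of lsof -t output rather than output captured from a real system.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePIDs converts terse lsof output (one process ID per line) into integers.
// (Hypothetical helper, for illustration.)
func parsePIDs(output string) ([]int, error) {
	var pids []int
	for _, field := range strings.Fields(output) {
		pid, err := strconv.Atoi(field)
		if err != nil {
			return nil, fmt.Errorf("unexpected value %q: %w", field, err)
		}
		pids = append(pids, pid)
	}
	return pids, nil
}

func main() {
	// Made-up sample in the shape of `lsof -t` output.
	sample := "660\n669\n848\n"

	pids, err := parsePIDs(sample)
	fmt.Println(pids, err) // prints: [660 669 848] <nil>
}
```

      In a real program the text could come from running lsof with the os/exec package, for example with exec.Command and its Output method, instead of a fixed string.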
      

      Anatomy of lsof Output

      The following command uses the -i option to display all open UDP files/connections:

      sudo lsof -i UDP
      
        
      COMMAND   PID USER  FD    TYPE DEVICE SIZE/OFF NODE NAME
      rpcbind   660  root  6u    IPv4  20296  0t0      UDP  *:sunrpc
      rpcbind   660  root  7u    IPv4  20298  0t0      UDP  *:836
      rpcbind   660  root  9u    IPv6  20300  0t0      UDP  *:sunrpc
      rpcbind   660  root  10u   IPv6  20301  0t0      UDP  *:836
      avahi-dae 669 avahi   12u   IPv4  20732  0t0      UDP  *:mdns
      avahi-dae 669 avahi   13u   IPv6  20733  0t0      UDP  *:mdns
      avahi-dae 669 avahi   14u   IPv4  20734  0t0      UDP  *:54087
      avahi-dae 669 avahi   15u   IPv6  20735  0t0      UDP  *:48582
      rsyslogd  675  root  6u    IPv4  20973  0t0      UDP  li10-121.members.linode.com:syslog
      dhclient  797  root  6u    IPv4  21828  0t0      UDP  *:bootpc
      ntpd      848   ntp   16u   IPv6  22807  0t0      UDP  *:ntp
      ntpd      848   ntp   17u   IPv4  22810  0t0      UDP  *:ntp
      ntpd      848   ntp   18u   IPv4  22814  0t0      UDP  localhost:ntp
      ntpd      848   ntp   19u   IPv4  22816  0t0      UDP  li10-121.members.linode.com:ntp
      ntpd      848   ntp   20u   IPv6  22818  0t0      UDP  localhost:ntp
      ntpd      848   ntp   24u   IPv6  24916  0t0      UDP  [2a01:7e00::f03c:91ff:fe69:1381]:ntp
      ntpd      848   ntp   25u   IPv6  24918  0t0      UDP  [fe80::f03c:91ff:fe69:1381]:ntp
      
      

      The output of lsof has various columns.

      • The COMMAND column contains the first nine characters of the name of the UNIX command associated with the process.
      • The PID column shows the process ID of the command.
      • The USER column displays the name of the user that owns the process.
      • The TID column shows the task ID. A blank TID indicates a process. Note that this column will not appear in the output of many lsof commands.
      • The FD column stands for file descriptor. Its value is either a descriptor number followed by an access mode character (for example, 6u for descriptor 6 opened for reading and writing) or a name such as cwd, txt, mem, or mmap.
      • The TYPE column displays the type of the file: regular file, directory, socket, etc.
      • The DEVICE column contains the device numbers separated by commas.
      • The value of the SIZE/OFF column is the size of the file or the file offset in bytes.
      • The value of the NODE column is the node number of a local file or, for network files such as those in the output above, the Internet protocol type (TCP or UDP).
      • Lastly, the NAME column shows the name of the mount point and file system where the file is located, or the Internet address.
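
      The column layout described above can be sketched as a small Go parser. This is a simplified illustration, assuming that the TID column is absent and that every other column is non-empty. The OpenFile type and the parseLine helper are hypothetical names. For robust parsing, the -F option is the better starting point, because it is designed to produce output for other programs.

```go
package main

import (
	"fmt"
	"strings"
)

// OpenFile holds the columns of one lsof output line, without the optional
// TID column. (Hypothetical type, for illustration.)
type OpenFile struct {
	Command, PID, User, FD, Type, Device, SizeOff, Node, Name string
}

// parseLine splits one output line on whitespace into its nine columns.
// NAME can contain spaces, such as "*:80 (LISTEN)", so the remaining
// fields are joined back together.
func parseLine(line string) (OpenFile, error) {
	f := strings.Fields(line)
	if len(f) < 9 {
		return OpenFile{}, fmt.Errorf("expected at least 9 columns, got %d", len(f))
	}
	return OpenFile{
		Command: f[0], PID: f[1], User: f[2], FD: f[3], Type: f[4],
		Device: f[5], SizeOff: f[6], Node: f[7],
		Name: strings.Join(f[8:], " "),
	}, nil
}

func main() {
	// Line taken from the sample UDP output shown earlier in this section.
	of, err := parseLine("rpcbind   660  root  6u    IPv4  20296  0t0      UDP  *:sunrpc")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(of.Command, of.FD, of.Node, of.Name) // prints: rpcbind 6u UDP *:sunrpc
}
```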

      The Repeat Mode

      Running lsof with the -r option puts lsof in repeat mode, re-running the command in a loop every few seconds. This mode is useful for monitoring for a process or a connection that might only exist for a short time. With -r, lsof repeats indefinitely, so when you are finished you must manually terminate the command.

      The +r option will also put lsof in repeat mode – the difference between -r and +r is that +r will
      automatically terminate lsof when a loop has no new output to print.

      When lsof is in repeat mode, it prints new output every t seconds (a loop); the default value
      of t is 15 seconds, which you can change by typing an integer value after -r or +r.

      The following command tells lsof to display all UDP connections every 10 seconds:

      sudo lsof -r 10 -i UDP
      

      Choosing Between IPv4 and IPv6

      lsof lists both IPv4 and IPv6 connections by default, but you can choose the kind
      of connections you want to display. The following command displays IPv4 connections
      only:

      sudo lsof -i4
      

      Therefore, the next command will display all TCP connections of the IPv4 protocol:

      sudo lsof -i4 -a -i TCP
      

      A roughly equivalent command is the following one that uses grep; note that grep also filters out the header line:

      sudo lsof -i4 | grep TCP
      

      On the other hand, the following command will display IPv6 connections only:

      sudo lsof -i6
      

      Therefore, the next command will display all UDP connections of the IPv6 protocol:

      sudo lsof -i6 | grep UDP
      
        
      avahi-dae  669  avahi  13u  IPv6  20733  0t0  UDP  *:mdns
      avahi-dae  669  avahi  15u  IPv6  20735  0t0  UDP  *:48582
      ntpd       848  ntp    16u  IPv6  22807  0t0  UDP  *:ntp
      ntpd       848  ntp    20u  IPv6  22818  0t0  UDP  localhost:ntp
      ntpd       848  ntp    24u  IPv6  24916  0t0  UDP  [2a01:7e00::f03c:91ff:fe69:1381]:ntp
      ntpd       848  ntp    25u  IPv6  24918  0t0  UDP  [fe80::f03c:91ff:fe69:1381]:ntp
      
      

      Logically AND All Options

      In this section of the guide you will learn how to logically AND the existing options
      using the -a flag. This provides you enhanced filtering capabilities. Take the following command as an example:

      sudo lsof -Pni -u www-data
      

      The above command would print out all network connections (-i), suppressing network number conversion (-n) and the conversion of port numbers to port names (-P), and it would also print out all files pertaining to the www-data user, without combining the two options into one logical statement.

      The following command combines these two options with the -a logical AND option and finds all open sockets belonging to the www-data user:

      lsof -Pni -a -u www-data
      
        
      COMMAND  PID    USER      FD  TYPE  DEVICE   SIZE/OFF  NODE  NAME
      apache2  6385   www-data  4u  IPv6  8626153  0t0       TCP   *:80 (LISTEN)
      apache2  6385   www-data  6u  IPv6  8626157  0t0       TCP   *:443 (LISTEN)
      apache2  6386   www-data  4u  IPv6  8626153  0t0       TCP   *:80 (LISTEN)
      apache2  6386   www-data  6u  IPv6  8626157  0t0       TCP   *:443 (LISTEN)
      apache2  6387   www-data  4u  IPv6  8626153  0t0       TCP   *:80 (LISTEN)
      apache2  6387   www-data  6u  IPv6  8626157  0t0       TCP   *:443 (LISTEN)
      apache2  24567  www-data  4u  IPv6  8626153  0t0       TCP   *:80 (LISTEN)
      apache2  24567  www-data  6u  IPv6  8626157  0t0       TCP   *:443 (LISTEN)
      apache2  24570  www-data  4u  IPv6  8626153  0t0       TCP   *:80 (LISTEN)
      apache2  24570  www-data  6u  IPv6  8626157  0t0       TCP   *:443 (LISTEN)
      apache2  24585  www-data  4u  IPv6  8626153  0t0       TCP   *:80 (LISTEN)
      apache2  24585  www-data  6u  IPv6  8626157  0t0       TCP   *:443 (LISTEN)
      apache2  25431  www-data  4u  IPv6  8626153  0t0       TCP   *:80 (LISTEN)
      apache2  25431  www-data  6u  IPv6  8626157  0t0       TCP   *:443 (LISTEN)
      apache2  27827  www-data  4u  IPv6  8626153  0t0       TCP   *:80 (LISTEN)
      apache2  27827  www-data  6u  IPv6  8626157  0t0       TCP   *:443 (LISTEN)
      apache2  27828  www-data  4u  IPv6  8626153  0t0       TCP   *:80 (LISTEN)
      apache2  27828  www-data  6u  IPv6  8626157  0t0       TCP   *:443 (LISTEN)
      apache2  27829  www-data  4u  IPv6  8626153  0t0       TCP   *:80 (LISTEN)
      apache2  27829  www-data  6u  IPv6  8626157  0t0       TCP   *:443 (LISTEN)
      
      

      Note

      You are allowed to place the -a option wherever you like as lsof will still detect the relevant options.

      Using Regular Expressions

      The -c option of lsof accepts regular expressions. A regular expression begins and ends with a
      forward slash (/) character. The ^ character anchors the beginning of the command name whereas $
      anchors its end. Each dot (.) character matches any single character of the command name.

      The following lsof command will find all commands that have precisely five characters:

      lsof -c /^.....$/
      
        
      COMMAND  PID  USER  FD   TYPE     DEVICE  SIZE/OFF  NODE  NAME
      netns    18   root  cwd  DIR      8,0     4096      2     /
      netns    18   root  rtd  DIR      8,0     4096      2     /
      netns    18   root  txt  unknown                          /proc/18/exe
      jfsIO    210  root  cwd  DIR      8,0     4096      2     /
      jfsIO    210  root  rtd  DIR      8,0     4096      2     /
      jfsIO    210  root  txt  unknown                          /proc/210/exe
      kstrp    461  root  cwd  DIR      8,0     4096      2     /
      kstrp    461  root  rtd  DIR      8,0     4096      2     /
      kstrp    461  root  txt  unknown                          /proc/461/exe
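
      To preview what such a pattern selects before handing it to lsof, you can apply the same expression to a list of command names. This is a rough sketch under assumptions: the function name five_char_commands is hypothetical, and the names reported by ps may not always be identical to the command names lsof sees.

```shell
#!/bin/sh
# Keep only the command names that are exactly five characters long.
# Reads one command name per line and prints each match once.
# five_char_commands is a hypothetical helper name.
five_char_commands() {
    grep -E '^.{5}$' | sort -u
}

# Example (not run here): ps -eo comm= | five_char_commands
```

      The expression ^.{5}$ is the extended-regex spelling of the five dots used with -c above.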
      
      

      Output For Other Programs

      Using the -F option, lsof generates output that is suitable for processing by scripts
      written in programming languages such as awk, perl and python.

      The following command will display each field of the lsof output in a separate line:

      sudo lsof -n -i4 -a -i TCP:ssh -F
      
        
      p812
      g812
      R1
      csshd
      u0
      Lroot
      f3
      au
      l
      tIPv4
      .
      .
      .
      
      

      Providing arguments to the -F option lets you generate less output. Note that the process ID
      and the file descriptor are always printed. As an example, the following command
      prints only the process ID, which is preceded by the p character, the file descriptor, which
      is preceded by the f character, and the protocol name of each entry, which is preceded by
      the P character:

      sudo lsof -n -i4 -a -i TCP:ssh -FP
      
        
      p812
      f3
      PTCP
      p22352
      f3
      PTCP
      p22361
      f3
      PTCP
      
      

      Note

      For the full list of options supported by -F, you should visit the manual page of lsof.
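
      Field output like the -FP listing above is meant to be consumed by a script. The sketch below shows one possible way to fold it into one row per open file with awk. It assumes the input contains only the p, f, and P fields, as in that listing, and the function name parse_lsof_fields is hypothetical.

```shell
#!/bin/sh
# Turn "lsof -FP"-style field output into one "PID FD PROTOCOL" row per file.
# Each input line is a one-letter field identifier followed by its value.
# parse_lsof_fields is a hypothetical helper name.
parse_lsof_fields() {
    awk '
        /^p/ { pid = substr($0, 2) }           # a p line starts a new process
        /^f/ { fd  = substr($0, 2) }           # an f line starts a new file
        /^P/ { print pid, fd, substr($0, 2) }  # a P line completes the row
    '
}

# Example (not run here):
#   sudo lsof -n -i4 -a -i TCP:ssh -FP | parse_lsof_fields
```

      Because the process ID is remembered until the next p line, a process with several open files yields one row per file.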

      Additional Examples

      Show All Open TCP Files

      Similar to the aforementioned UDP command, the following command will display all open TCP files/connections:

      sudo lsof -i TCP
      
        
      COMMAND  PID    USER      FD   TYPE  DEVICE   SIZE/OFF  NODE  NAME
      sshd     812    root      3u   IPv4  23674    0t0       TCP   *:ssh (LISTEN)
      sshd     812    root      4u   IPv6  23686    0t0       TCP   *:ssh (LISTEN)
      mysqld   1003   mysql     17u  IPv4  24217    0t0       TCP   localhost:mysql (LISTEN)
      master   1245   root      13u  IPv4  24480    0t0       TCP   *:smtp (LISTEN)
      sshd     22352  root      3u   IPv4  8613370  0t0       TCP   li10-121.members.linode.com:ssh->ppp-2-8-23-19.home.otenet.gr:60032 (ESTABLISHED)
      sshd     22361  mtsouk    3u   IPv4  8613370  0t0       TCP   li10-121.members.linode.com:ssh->ppp-2-8-23-19.home.otenet.gr:60032 (ESTABLISHED)
      apache2  24565  root      4u   IPv6  8626153  0t0       TCP   *:http (LISTEN)
      apache2  24565  root      6u   IPv6  8626157  0t0       TCP   *:https (LISTEN)
      apache2  24567  www-data  4u   IPv6  8626153  0t0       TCP   *:http (LISTEN)
      apache2  24567  www-data  6u   IPv6  8626157  0t0       TCP   *:https (LISTEN)
      apache2  24568  www-data  4u   IPv6  8626153  0t0       TCP   *:http (LISTEN)
      apache2  24568  www-data  6u   IPv6  8626157  0t0       TCP   *:https (LISTEN)
      apache2  24569  www-data  4u   IPv6  8626153  0t0       TCP   *:http (LISTEN)
      apache2  24569  www-data  6u   IPv6  8626157  0t0       TCP   *:https (LISTEN)
      apache2  24570  www-data  4u   IPv6  8626153  0t0       TCP   *:http (LISTEN)
      apache2  24570  www-data  6u   IPv6  8626157  0t0       TCP   *:https (LISTEN)
      apache2  24571  www-data  4u   IPv6  8626153  0t0       TCP   *:http (LISTEN)
      apache2  24571  www-data  6u   IPv6  8626157  0t0       TCP   *:https (LISTEN)
      
      

      Listing All ESTABLISHED Connections

      Internet Connections

      If you process the output of lsof with some traditional UNIX command line tools, like grep and awk,
      you can list all active network connections:

      sudo lsof -i -n -P | grep ESTABLISHED | awk '{print $1, $9}' | sort -u
      
        
      sshd 109.74.193.253:22->2.86.23.29:60032
      
      

      Note

      The lsof -i -n -P command can also be written as lsof -i -nP or alternatively as
      lsof -nPi. Writing it as lsof -inP would generate a syntax error because lsof
      treats nP as a parameter to -i.
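
      If you want to know how many established connections each remote address holds, the same kind of post-processing can be done in a single awk pass. The following is a minimal sketch that assumes numeric output (lsof -i -n -P), where the connection appears as local:port->remote:port in the second-to-last field. The function name count_remote_peers is hypothetical.

```shell
#!/bin/sh
# Count ESTABLISHED connections per remote address.
# Expects "lsof -i -n -P" output on standard input.
# count_remote_peers is a hypothetical helper name.
count_remote_peers() {
    awk '$NF == "(ESTABLISHED)" {
             split($(NF - 1), ends, "->")  # ends[1] is local, ends[2] is remote
             peer = ends[2]
             sub(/:[0-9]+$/, "", peer)     # drop the trailing :port
             seen[peer]++
         }
         END { for (p in seen) print seen[p], p }' | sort -rn
}

# Example (not run here): sudo lsof -i -n -P | count_remote_peers
```

      Each output line holds a count followed by a remote address, with the busiest peers listed first.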

      SSH Connections

      The following command finds all established SSH connections to the local machine:

      sudo lsof | grep sshd | grep ESTABLISHED
      
        
      253.members.linode.com:ssh->ppp-2-86-23-29.home.otenet.gr:60032 (ESTABLISHED)
      sshd  22361  mtsouk  3u  IPv4  8613370  0t0  TCP li140-253.members.linode.com:ssh->ppp-2-86-23-29.home.otenet.gr:60032 (ESTABLISHED)
      
      

      The following command produces the same output as the previous command but runs more quickly, because the -i TCP
      option limits the amount of information lsof prints, which means that grep has less data
      to process:

      sudo lsof -i TCP | grep ssh | grep ESTABLISHED
      

      Alternatively, you can let lsof filter by TCP state with the -s option. Because -P disables
      the conversion of port numbers to names and grep is case sensitive, match on the sshd command name:

      sudo lsof -nP -iTCP -sTCP:ESTABLISHED | grep sshd
      

      Showing Processes that are Listening to a Particular Port

      The following command shows all network connections that use port number 22
      (ssh), over either UDP or TCP, whether they are listening or established:

      sudo lsof -i :22
      
        
      COMMAND  PID    USER    FD  TYPE  DEVICE   SIZE/OFF  NODE  NAME
      sshd     812    root    3u  IPv4  23674    0t0       TCP   *:ssh (LISTEN)
      sshd     812    root    4u  IPv6  23686    0t0       TCP   *:ssh (LISTEN)
      sshd     22352  root    3u  IPv4  8613370  0t0       TCP   li140-253.members.linode.com:ssh->ppp-2-86-23-29.home.otenet.gr:60032 (ESTABLISHED)
      sshd     22361  mtsouk  3u  IPv4  8613370  0t0       TCP   li140-253.members.linode.com:ssh->ppp-2-86-23-29.home.otenet.gr:60032 (ESTABLISHED)
      
      

      Determine Which Program Listens to a TCP port

      One of the most frequent uses of lsof is determining which program listens to a given TCP port.
      The following command will print TCP processes that are in the LISTEN state by using the -s option to provide a protocol and protocol state:

      sudo lsof -nP -i TCP -s TCP:LISTEN
      
        
      COMMAND  PID    USER      FD   TYPE  DEVICE   SIZE/OFF  NODE  NAME
      sshd     812    root      3u   IPv4  23674    0t0       TCP   *:22 (LISTEN)
      sshd     812    root      4u   IPv6  23686    0t0       TCP   *:22 (LISTEN)
      mysqld   1003   mysql     17u  IPv4  24217    0t0       TCP   127.0.0.1:3306 (LISTEN)
      master   1245   root      13u  IPv4  24480    0t0       TCP   *:25 (LISTEN)
      apache2  24565  root      4u   IPv6  8626153  0t0       TCP   *:80 (LISTEN)
      apache2  24565  root      6u   IPv6  8626157  0t0       TCP   *:443 (LISTEN)
      apache2  24567  www-data  4u   IPv6  8626153  0t0       TCP   *:80 (LISTEN)
      apache2  24567  www-data  6u   IPv6  8626157  0t0       TCP   *:443 (LISTEN)
      apache2  24568  www-data  4u   IPv6  8626153  0t0       TCP   *:80 (LISTEN)
      apache2  24568  www-data  6u   IPv6  8626157  0t0       TCP   *:443 (LISTEN)
      apache2  24569  www-data  4u   IPv6  8626153  0t0       TCP   *:80 (LISTEN)
      apache2  24569  www-data  6u   IPv6  8626157  0t0       TCP   *:443 (LISTEN)
      apache2  24570  www-data  4u   IPv6  8626153  0t0       TCP   *:80 (LISTEN)
      apache2  24570  www-data  6u   IPv6  8626157  0t0       TCP   *:443 (LISTEN)
      apache2  24571  www-data  4u   IPv6  8626153  0t0       TCP   *:80 (LISTEN)
      apache2  24571  www-data  6u   IPv6  8626157  0t0       TCP   *:443 (LISTEN)
      apache2  24585  www-data  4u   IPv6  8626153  0t0       TCP   *:80 (LISTEN)
      apache2  24585  www-data  6u   IPv6  8626157  0t0       TCP   *:443 (LISTEN)
      
      

      Other possible states of a TCP connection are CLOSED, SYN-SENT, SYN-RECEIVED,
      ESTABLISHED, CLOSE-WAIT, LAST-ACK, FIN-WAIT-1, FIN-WAIT-2, CLOSING, and TIME-WAIT.
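
      To see how the TCP connections on a machine are distributed across states, you can count the state shown in parentheses at the end of each line. This is a sketch under the assumption that the state is always the last field, as in the listings in this guide. The function name tcp_state_counts is hypothetical.

```shell
#!/bin/sh
# Count lsof TCP rows per connection state, for example LISTEN or ESTABLISHED.
# Expects "lsof -nP -i TCP" output on standard input.
# tcp_state_counts is a hypothetical helper name.
tcp_state_counts() {
    awk '$NF ~ /^\(.+\)$/ {
             state = substr($NF, 2, length($NF) - 2)  # strip the parentheses
             count[state]++
         }
         END { for (s in count) print count[s], s }' | sort -k1,1nr -k2,2
}

# Example (not run here): sudo lsof -nP -i TCP | tcp_state_counts
```

      Rows without a parenthesized final field, such as the header, are ignored.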

      Finding Information on a Given Protocol

      The next lsof command shows open UDP files that use the NTP (Network Time Protocol) port only:

      sudo lsof -i UDP:ntp
      
        
      COMMAND  PID  USER  FD   TYPE  DEVICE  SIZE/OFF  NODE  NAME
      ntpd     848  ntp   16u  IPv6  22807   0t0       UDP   *:ntp
      ntpd     848  ntp   17u  IPv4  22810   0t0       UDP   *:ntp
      ntpd     848  ntp   18u  IPv4  22814   0t0       UDP   localhost:ntp
      ntpd     848  ntp   19u  IPv4  22816   0t0       UDP   li140-253.members.linode.com:ntp
      ntpd     848  ntp   20u  IPv6  22818   0t0       UDP   localhost:ntp
      ntpd     848  ntp   24u  IPv6  24916   0t0       UDP   [2a01:7e00::f03c:91ff:fe69:1381]:ntp
      ntpd     848  ntp   25u  IPv6  24918   0t0       UDP   [fe80::f03c:91ff:fe69:1381]:ntp
      
      

      The output displays connections that use either IPv4 or IPv6. If you want to display
      the connections that use IPv4 only, you can run the following command:

      sudo lsof -i4 -a -i UDP:ntp
      
        
      COMMAND  PID  USER  FD   TYPE  DEVICE  SIZE/OFF  NODE  NAME
      ntpd     848  ntp   17u  IPv4  22810   0t0       UDP   *:ntp
      ntpd     848  ntp   18u  IPv4  22814   0t0       UDP   localhost:ntp
      ntpd     848  ntp   19u  IPv4  22816   0t0       UDP   li140-253.members.linode.com:ntp
      
      

      Disabling DNS and port Number Resolving

      lsof uses the data found in the /etc/services file to map a port number to a
      service. You can disable this functionality by using the -P option as follows:

      lsof -P -i UDP:ntp -a -i4
      
        
      COMMAND  PID  USER  FD   TYPE  DEVICE  SIZE/OFF  NODE  NAME
      ntpd     848  ntp   17u  IPv4  22810   0t0       UDP   *:123
      ntpd     848  ntp   18u  IPv4  22814   0t0       UDP   localhost:123
      ntpd     848  ntp   19u  IPv4  22816   0t0       UDP   li140-253.members.linode.com:123
      
      

      In a similar way, you can disable DNS resolving using the -n option:

      lsof -P -i UDP:ntp -a -i4 -n
      
        
      COMMAND  PID  USER  FD   TYPE  DEVICE  SIZE/OFF  NODE  NAME
      ntpd     848  ntp   17u  IPv4  22810   0t0       UDP   *:123
      ntpd     848  ntp   18u  IPv4  22814   0t0       UDP   127.0.0.1:123
      ntpd     848  ntp   19u  IPv4  22816   0t0       UDP   109.74.193.253:123
      
      

      The -n option can be particularly useful when you have a problem with your DNS
      servers or when you are interested in the actual IP address.

      Find Network Connections From or To an External Host

      The following command finds all network connections coming from or going to ppp-2-86-23-29.home.example.com:

      sudo lsof -i @ppp-2-86-23-29.home.example.com
      
        
      sshd  22352  root    3u  IPv4 8613370  0t0  TCP  li140-253.members.linode.com:ssh->ppp-2-86-23-29.home.example.com:60032 (ESTABLISHED)
      sshd  22361  mtsouk  3u  IPv4 8613370  0t0  TCP  li140-253.members.linode.com:ssh->ppp-2-86-23-29.home.example.com:60032 (ESTABLISHED)
      
      

      You can also specify the range of ports that interest you as follows:

      sudo lsof -i @ppp-2-86-23-29.home.example.com:200-250
      

      Determine Which Processes are Accessing a Given File

      With lsof you can find the processes that are accessing a given file. For example, by running the lsof command on its own executable file you can determine the processes that are accessing it:

      sudo lsof `which lsof`
      
        
      lsof  25079  root  txt  REG  8,0  163136 5693 /usr/bin/lsof
      lsof  25080  root  txt  REG  8,0  163136 5693 /usr/bin/lsof
      
      

      There are two lines in the above output because the /usr/bin/lsof file is being accessed twice, by
      both which(1) and lsof.

      If you are only interested in the process IDs of the processes that are accessing
      a file, you can use the -t option, which produces terse output containing process IDs only, with no header line:

      sudo lsof -t `which lsof`
      
        
      25157
      25158
      
      

      A process ID is commonly passed to the kill(1) command to terminate a process.
      However, this is something that should only be done with great care.

      List Open Files Under a Given Directory

      The +D option displays all open files under a given directory,
      which in this case is /etc, as well as the name of the process that keeps each
      file or directory open:

      sudo lsof +D /etc
      
        
      COMMAND    PID  USER   FD   TYPE  DEVICE  SIZE/OFF  NODE    NAME
      avahi-dae  669  avahi  cwd  DIR   8,0     4096      745751  /etc/avahi
      avahi-dae  669  avahi  rtd  DIR   8,0     4096      745751  /etc/avahi
      
      

      List Files that are Opened by a Specific User

      Another option is to locate the files opened by
      any user, including web and database users.

      The following command lists all open files opened by the www-data user:

      sudo lsof -u www-data
      
        
      COMMAND   PID   USER      FD   TYPE  DEVICE  SIZE/OFF  NODE  NAME
      php5-fpm  1066  www-data  cwd  DIR   8,0     4096      2     /
      php5-fpm  1066  www-data  rtd  DIR   8,0     4096      2     /
      
      ...
      
      

      The next variation finds all ESTABLISHED connections owned by the www-data user:

      sudo lsof -u www-data | grep -i ESTABLISHED
      
        
      apache2  24571  www-data  29u  IPv6  8675584  0t0  TCP  li140-253.members.linode.com:https->ppp-2-86-23-29.home.otenet.gr:61383 (ESTABLISHED)
      apache2  24585  www-data  29u  IPv6  8675583  0t0  TCP  li140-253.members.linode.com:https->ppp-2-86-23-29.home.otenet.gr:61381 (ESTABLISHED)
      apache2  27827  www-data  29u  IPv6  8675582  0t0  TCP  li140-253.members.linode.com:https->ppp-2-86-23-29.home.otenet.gr:61382 (ESTABLISHED)
      
      

      Last, the next command will find all processes except the ones owned by www-data by using the ^ character:

      sudo lsof -u ^www-data
      
        
      COMMAND  PID  TID   USER  FD    TYPE  DEVICE  SIZE/OFF  NODE     NAME
      systemd  1          root  cwd   DIR   8,0     4096      2        /
      systemd  1          root  rtd   DIR   8,0     4096      2        /
      systemd  1          root  txt   REG   8,0     1120992   1097764  /lib/systemd/systemd
      
      ...
      
      

      If the user name you are trying to use does not exist, you will get an error message
      similar to the following:

      sudo lsof -u doesNotExist
      
        
      lsof: can't get UID for doesNotExist
      lsof 4.89
       latest revision: ftp://lsof.itap.purdue.edu/pub/tools/unix/lsof/
       latest FAQ: ftp://lsof.itap.purdue.edu/pub/tools/unix/lsof/FAQ
       latest man page: ftp://lsof.itap.purdue.edu/pub/tools/unix/lsof/lsof_man
       usage: [-?abhKlnNoOPRtUvVX] [+|-c c] [+|-d s] [+D D] [+|-E] [+|-e s] [+|-f[gG]]
       [-F [f]] [-g [s]] [-i [i]] [+|-L [l]] [+m [m]] [+|-M] [-o [o]] [-p s]
       [+|-r [t]] [-s [p:s]] [-S [t]] [-T [t]] [-u s] [+|-w] [-x [fl]] [--] [names]
      Use the ``-h'' option to get more help information.
      
      

      Kill All Processes Owned by a User

      The following command will kill all of the processes owned by the www-data user:

      Caution

      Please be careful when combining lsof with the kill(1) command. Do not try to
      test similar commands on a live server unless you are absolutely certain you will not experience issues – for testing purposes you can use a disposable Docker image or something similar.

      sudo kill -9 `lsof -t -u www-data`
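
      One way to reduce the risk is to review what would be killed before anything is signalled. The sketch below only prints the kill commands and never runs them. It is an illustration under assumptions: the function name print_kill_plan is hypothetical, and it proposes the gentler TERM signal instead of -9 as a design choice, not as something this guide prescribes.

```shell
#!/bin/sh
# Print the kill commands that would be run instead of running them.
# Reads one PID per line, as produced by "lsof -t", and skips anything
# that is not a plain number.
# print_kill_plan is a hypothetical helper name.
print_kill_plan() {
    while IFS= read -r pid; do
        case "$pid" in
            ''|*[!0-9]*) continue ;;               # ignore blank or non-numeric lines
            *) printf 'kill -TERM %s\n' "$pid" ;;  # show the command, do not execute it
        esac
    done
}

# Example (not run here): sudo lsof -t -u www-data | print_kill_plan
```

      After reviewing the printed list, you can decide whether to run the commands by hand.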
      

      Find All Network Activity from a Given User

      The following command lists all network activity by a user named mtsouk:

      lsof -a -u mtsouk -i
      
        
      COMMAND  PID    USER    FD  TYPE  DEVICE   SIZE/OFF  NODE  NAME
      sshd     22361  mtsouk  3u  IPv4  8613370  0t0       TCP   li140-253.members.linode.com:ssh->ppp-2-86-23-29.home.otenet.gr:60032 (ESTABLISHED)
      
      

      On the other hand, the following command lists all network activity from processes not owned by
      the root or the www-data user:

      lsof -a -u ^root -i -u ^www-data
      
        
      avahi-dae  669    avahi   12u  IPv4  20732    0t0  UDP   *:mdns
      avahi-dae  669    avahi   13u  IPv6  20733    0t0  UDP   *:mdns
      avahi-dae  669    avahi   14u  IPv4  20734    0t0  UDP   *:54087
      avahi-dae  669    avahi   15u  IPv6  20735    0t0  UDP   *:48582
      ntpd       848    ntp     16u  IPv6  22807    0t0  UDP  *:ntp
      ntpd       848    ntp     17u  IPv4  22810    0t0  UDP  *:ntp
      ntpd       848    ntp     18u  IPv4  22814    0t0  UDP  localhost:ntp
      ntpd       848    ntp     19u  IPv4  22816    0t0  UDP  li140-253.members.linode.com:ntp
      ntpd       848    ntp     20u  IPv6  22818    0t0  UDP  localhost:ntp
      ntpd       848    ntp     24u  IPv6  24916    0t0  UDP  [2a01:7e00::f03c:91ff:fe69:1381]:ntp
      ntpd       848    ntp     25u  IPv6  24918    0t0  UDP  [fe80::f03c:91ff:fe69:1381]:ntp
      mysqld     1003   mysql   17u  IPv4  24217    0t0  TCP  localhost:mysql (LISTEN)
      sshd       22361  mtsouk  3u   IPv4  8613370  0t0  TCP  li140-253.members.linode.com:ssh->ppp-2-86-23-29.home.otenet.gr:60032 (ESTABLISHED)
      
      

      Find the Total Number of TCP and UDP Connections

      If you process the output of lsof with some traditional UNIX command line tools, like grep and awk,
      you can calculate the total number of TCP and UDP connections:

      sudo lsof -i | awk '{print $8}' | sort | uniq -c | grep -E 'TCP|UDP'
      
        
           28 TCP
           13 UDP
      
      

      The lsof -i command lists all Internet connections, whereas awk extracts the 8th
      field, which is the value of the NODE column, and sort sorts the output. Then, the
      uniq -c command counts how many times each line occurs. Last, the grep -E 'TCP|UDP'
      command keeps only the lines that contain the word TCP or UDP. The -E option is needed so that grep treats the | character as alternation.
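
      The same count can also be produced by awk alone, without sort, uniq, or grep. This is a minimal sketch that assumes the default column layout, where the eighth column is NODE. The function name count_tcp_udp is hypothetical.

```shell
#!/bin/sh
# Count TCP and UDP rows in "lsof -i" output using the NODE column.
# Reads lsof output on standard input.
# count_tcp_udp is a hypothetical helper name.
count_tcp_udp() {
    awk '$8 == "TCP" || $8 == "UDP" { n[$8]++ }
         END { printf "%d TCP\n%d UDP\n", n["TCP"], n["UDP"] }'
}

# Example (not run here): sudo lsof -i | count_tcp_udp
```

      Unlike the pipeline, this version always prints both lines, showing 0 when a protocol has no entries.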

      Summary

      lsof is a powerful diagnostic tool, and its command line options can be combined in many ways to troubleshoot the issues administrators face. This guide has covered only a few examples. Other combinations of options can be tailored to your specific needs.


      This guide is published under a CC BY-ND 4.0 license.




      Use Cases for Linode Object Storage


      Updated by Linode

      Contributed by

      Linode

      What is Object Storage?

      Object Storage is a method of storing data that differs in a number of ways from Block Storage. Block Storage splits files into small blocks of data. Minimal file metadata is stored alongside this data and, in general, descriptive metadata must be stored in a separate file or database. In order to use a Block Storage volume it must be attached to a host server, where it acts like a hard drive.

      In contrast, Object Storage stores data, called objects, in containers, called buckets, and each object is given a unique identifier with which it is accessed. In this way, the physical location of the object does not need to be known. The objects are stored alongside rich, configurable metadata that can be used to describe any number of arbitrary properties about the object. Each object has its own URL, so accessing the data is often as simple as issuing an HTTP request, either by visiting the object in a browser or retrieving it through the command line.
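
      As an illustration of the one-URL-per-object idea, the sketch below builds an object URL from its parts. Everything specific in it is an assumption: the host pattern bucket.cluster.linodeobjects.com, the bucket name, the cluster ID, and the function name object_url are hypothetical placeholders, so check the URL shown for your own bucket before relying on it.

```shell
#!/bin/sh
# Build the HTTP URL of an object from a bucket, a cluster ID, and an object key.
# The host pattern below is an assumption made for illustration only.
# object_url is a hypothetical helper name.
object_url() {
    # $1 = bucket name, $2 = cluster ID, $3 = object key
    printf 'https://%s.%s.linodeobjects.com/%s\n' "$1" "$2" "$3"
}

# Example (not run here):
#   curl -O "$(object_url my-bucket us-east-1 index.html)"
```

      A public object could then be fetched with any HTTP client, whereas a private object would additionally need credentials.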

      Benefits and Limitations

      Object Storage scales easily because all the objects are stored in a flat, scalable name space. Object Storage does not require a host server in order to be used, meaning many different clients can read from it or write to it.

      With that said, there are limitations to Object Storage. Objects in Object Storage cannot be modified at the block level, as with Block Storage, and must be rewritten in their entirety every time a change is made. This makes Object Storage a poor choice for any scenario with many successive read/write operations, such as databases or transactional data. Additionally, Object Storage traffic runs over HTTP, so it does not benefit from the I/O speeds of a mounted Block Storage volume. As a rule of thumb, Object Storage shines when files do not need to be updated frequently.

      Below are some of the more popular use cases for Object Storage.

      Use Cases

      Static Site Hosting

      Because Object Storage buckets provide HTTP access to objects, it’s easy to set up a bucket to serve static websites. A static website is a website that does not require a server-side processing language like PHP to render content. Because a static site does not need to process each page on every request, it usually loads very quickly. For more information on setting up a static site on Object Storage, read our Host a Static Site on Linode Object Storage guide. For more on static site generators, visit our How to Choose a Static Site Generator guide.

      Website Files

      If you don’t want to host your entire site on Object Storage (for example: you plan to use a CMS like WordPress), you can still choose to host some of your site’s assets, like images and downloads, with Object Storage. This will save disk space on your server and can help reduce your costs.

      Software Storage and Downloads

      Similar to hosting website files, hosting software applications on Object Storage is a great use case for developers looking to give quick access to their products. Simply upload the file to a bucket and share its URL.

      Unstructured Data

      Unstructured data is any data that does not fit into a traditional database. Object Storage excels at storing large amounts of unstructured data. With the ability to configure custom metadata for each piece of unstructured data, it is easy to extrapolate useful information from each object and to retrieve objects with similar metadata. Examples of unstructured data include images, video, audio, documents, and Big Data.

      Images, Video, Audio, and Documents

      Multimedia assets like images, videos, audio files, and documents are a perfect match for Object Storage. In general these types of files do not change frequently, so there is no need to store them on Block Storage volumes. Because each file has its own URL, streaming the content of these files or embedding them in another program or website is simple and does not require the use of a server.

      Big Data

      Big Data typically describes data sets that are so large and so diverse that it takes specialized tooling to analyze them. In many cases the data that comprises Big Data is considered unstructured and does not fit neatly into a database, making it a great candidate for Object Storage.

      Artifact Storage

      As more and more of the development life cycle becomes automated and tested, more and more artifacts are generated in the process. Object Storage is a great solution for developers looking to store these artifacts, such as the bulk collection of logs. Sharing stored artifacts is as simple as sharing a URL. And if you’d rather your artifacts stay private, you can distribute an access key.

      Cold Storage

      Object Storage is, in the majority of cases, significantly cheaper than Block Storage. While Object Storage can incur a cost when retrieving data, the cost benefit for infrequently accessed data can provide you with an overall cost reduction when compared to similar methods.

      Similarly, Object Storage has benefits over tape storage. Tape storage is frequently used for archival purposes, but the read times that come with tape storage are many times longer than what you’ll find with Object Storage. Special considerations have to be made when transferring tape drive data, such as the ability to ship drives safely across long distances. With Object Storage, this data is available through HTTP from anywhere in the world.

      Note

      The outbound data transfer for Linode Object Storage is part of your Linode account’s total transfer pool, which will reduce or completely eliminate transfer costs for Object Storage if you are also running Linode instances. If you expend your allotted transfer pool, you will be billed at a rate of $0.02 per GB for outbound transfers.
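
      The billing rule in the note can be worked through with a short calculation. This is a sketch of the arithmetic only, using the $0.02 per GB rate quoted above. The function name transfer_overage_cost and the sample figures are hypothetical.

```shell
#!/bin/sh
# Estimate the outbound transfer overage charge in dollars.
# Only the transfer beyond the account pool is billed, at 0.02 per GB.
# transfer_overage_cost is a hypothetical helper name.
transfer_overage_cost() {
    # $1 = outbound transfer used (GB), $2 = account transfer pool (GB)
    awk -v used="$1" -v pool="$2" -v rate=0.02 'BEGIN {
        over = used - pool
        if (over < 0) over = 0      # staying inside the pool costs nothing
        printf "%.2f\n", over * rate
    }'
}

# Example: transfer_overage_cost 1500 1000
```

      With 1500 GB used against a 1000 GB pool, the 500 GB overage comes to 500 × 0.02 = 10.00 dollars, while usage below the pool yields 0.00.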

      Backups

      Databases and other critical data can be backed up to Object Storage with little effort using a command line client for easier automation. Objects within Object Storage are normally replicated three times, providing resiliency should an error occur with the underlying hardware. Additionally, buckets can be versioned so you never lose access to older backups.

      Private File Storage

      Objects can be made private and only accessible with a key. By default, all new objects in a bucket are set to private, so they are inaccessible by normal HTTP requests (though it’s easy to set public permissions on objects if you’d like). This makes it easy to store secure data.

      Next Steps

      If you’re curious about how to use Object Storage, you can read our guide on How to Use Linode Object Storage for detailed instructions on creating buckets and uploading objects. Read our Host a Static Site using Linode Object Storage to get started with hosting your own site on Object Storage.


      This guide is published under a CC BY-ND 4.0 license.


