      Understanding Suricata Signatures


      Introduction

      The first tutorial in this series explained how to install and configure Suricata. If you followed that tutorial, you also learned how to download and update Suricata rulesets, and how to examine logs for alerts about suspicious activity. However, the rules that you downloaded in that tutorial are numerous, and cover many different protocols, applications, and attack vectors that may not be relevant to your network and servers.

      In this tutorial you’ll learn how Suricata signatures are structured, and some important options that are commonly used in most rules. Once you are familiar with how to understand the structure and fields in a signature, you’ll be able to write your own signatures that you can combine with a firewall to alert you about most suspicious traffic to your servers, without needing to use other external rulesets.

      This approach to writing and managing rules means that you can use Suricata more efficiently, since it only needs to process the specific rules that you write. Once you have a ruleset that describes the majority of the legitimate and suspicious traffic that you expect to encounter in your network, you can start to selectively drop invalid traffic using Suricata in its active Intrusion Prevention (IPS) mode. The next tutorial in this series will explain how to enable Suricata’s IPS functionality.

      Prerequisites

      For the purposes of this tutorial, you can run Suricata on any system, since signatures generally do not require any particular operating system. If you are following this tutorial series, then you should already have Suricata installed and configured, with a ruleset downloaded, as described in the first tutorial.

      Understanding the Structure of Suricata Signatures

      Suricata signatures can appear complex at first, but once you learn how they are structured, and how Suricata processes them, you’ll be able to create your own rules to suit your network’s requirements.

      At a high level, Suricata signatures consist of three parts:

      1. An Action to take when traffic matches the rule.
      2. A Header that describes hosts, IP addresses, ports, protocols, and the direction of traffic (incoming, or outgoing).
      3. Options, which specify things like the Signature ID (sid), log message, regular expressions that match the contents of packets, classification type, and other modifiers that can help identify legitimate and suspicious traffic.

      The general structure of a signature is the following:

      Generic Rule Structure

      ACTION HEADER OPTIONS
      

      The header and options portions of a signature have multiple sections. For example, in the previous tutorial, you tested Suricata using the rule with sid 2100498. Here is the complete rule for reference:

      sid:2100498

      alert ip any any -> any any (msg:"GPL ATTACK_RESPONSE id check returned root"; content:"uid=0|28|root|29|"; classtype:bad-unknown; sid:2100498; rev:7; metadata:created_at 2010_09_23, updated_at 2010_09_23;)
      

      The alert portion of the signature is the action, the ip any any -> any any section is the header, and the rest of the signature starting with (msg:GPL ATTACK_RESPONSE... contains the rule’s options.
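      The three-part split described above can be sketched in a few lines of Python. This is only an illustration, not how Suricata itself parses rules: the split_signature helper and its regular expression are hypothetical, and they assume a single-line rule whose header contains no parentheses.

```python
import re

# Hypothetical helper for illustration; Suricata's own parser is far stricter.
# Assumes a single-line rule with no parentheses in the header.
RULE_RE = re.compile(
    r"^\s*(?P<action>\S+)\s+(?P<header>[^()]+?)\s*\((?P<options>.*)\)\s*$"
)

def split_signature(rule: str) -> dict:
    """Return the action, header, and options portions of one signature."""
    match = RULE_RE.match(rule)
    if match is None:
        raise ValueError("not a recognizable signature")
    return match.groupdict()

rule = (
    'alert ip any any -> any any (msg:"GPL ATTACK_RESPONSE id check returned root"; '
    'content:"uid=0|28|root|29|"; classtype:bad-unknown; sid:2100498; rev:7; '
    'metadata:created_at 2010_09_23, updated_at 2010_09_23;)'
)
parts = split_signature(rule)
print(parts["action"])  # alert
print(parts["header"])  # ip any any -> any any
```

      The options string is left untouched here, since options need quote-aware splitting of their own.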

      In the following sections you’ll examine each part of a Suricata rule in detail.

      Actions

      The first part of the sid:2100498 signature is the action, in this case alert. The action portion of a Suricata signature specifies the action to take when a packet matches the rule. An action can be one of the following depending on whether Suricata is operating in IDS or IPS mode:

      • Pass – Suricata will stop scanning the packet and allow it, without generating an alert.
      • Drop – When working in IPS mode, Suricata will immediately stop processing the packet and generate an alert. If the connection that generated the packet uses TCP it will time out.
      • Reject – When Suricata is running IPS mode, a TCP reset packet will be sent, and Suricata will drop the matching packet.
      • Alert – Suricata will generate an alert and log it for further analysis.

      Headers

      Each Suricata signature has a header section that describes the network protocol, source and destination IP addresses, ports, and direction of traffic. Referring to the example sid:2100498 signature, the header section of the rule is the highlighted ip any any -> any any portion:

      sid:2100498

      alert ip any any -> any any (msg:"GPL ATTACK_RESPONSE id check returned root"; content:"uid=0|28|root|29|"; classtype:bad-unknown; sid:2100498; rev:7; metadata:created_at 2010_09_23, updated_at 2010_09_23;)
      

      The general format of a rule’s header section is:

      Rule Format

      <PROTOCOL> <SOURCE IP> <SOURCE PORT> -> <DESTINATION IP> <DESTINATION PORT>
      

      The Protocol can be one of the following:

      • TCP
      • UDP
      • ICMP
      • IP
      • A number of other application protocols

      The Source and Destination fields can be IP addresses or network ranges, or the special value any, which will match all IP addresses and networks. The -> arrow indicates the direction of traffic.

      Note: Signatures can also use a non-directional marker <> that will match traffic in both directions. However, the Suricata documentation about directional markers notes that most rules will use the -> right matching arrow.

      If you wanted to alert on malicious outbound traffic (that is, traffic leaving your network), then the Source field would be the IP address or network range of your system. The Destination could be a remote system’s IP or network, or the special any value.

      Conversely, if you wanted to generate an alert for malicious incoming traffic, the Source field could be set to any, and the Destination to your system’s IP address or network range.

      You can also specify the TCP or UDP port to examine using the Port fields. Generally, traffic originating from a system is assigned a random port, so the any value is appropriate for the left side of the -> indicator. The destination port can also be any if you plan to examine the contents of every incoming packet, or you can limit a signature to only scan packets on individual ports, like 22 for SSH traffic, or 443 for HTTPS.
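      To make the header fields concrete, the following Python sketch splits a header string into the fields of the rule format shown earlier. The Header class and parse_header function are hypothetical names for this illustration, and the sketch assumes that each field is a single whitespace-free token (so bracketed address or port groups that contain spaces would not parse). The 203.0.113.0/24 network is only a placeholder.

```python
from dataclasses import dataclass

@dataclass
class Header:
    protocol: str
    src_ip: str
    src_port: str
    direction: str
    dst_ip: str
    dst_port: str

def parse_header(header: str) -> Header:
    """Split '<PROTOCOL> <SRC IP> <SRC PORT> -> <DST IP> <DST PORT>' into fields."""
    fields = header.split()
    # Assumption: exactly six whitespace-separated tokens, arrow in fourth place.
    if len(fields) != 6 or fields[3] not in ("->", "<>"):
        raise ValueError(f"unexpected header: {header!r}")
    return Header(*fields)

# Inbound SSH traffic: any source and source port, a specific destination network and port.
inbound = parse_header("tcp any any -> 203.0.113.0/24 22")
print(inbound.dst_ip, inbound.dst_port)  # 203.0.113.0/24 22
```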

      The ip any any -> any any header from sid:2100498 is a generic header that will match all traffic, regardless of protocol, source or destination IPs, or ports. This kind of catch-all header is useful when you want to ensure inbound and outbound traffic is checked for suspicious content.

      Note that the Source, Destination, and Port fields can also use the special ! negation operator, which will process traffic that does not match the value of the field.

      For example, the following signature would make Suricata alert on all incoming SSH packets from any network that are destined for your network (represented by the 203.0.113.0/24 IP block), that are not destined for port 22:

      Example Header

      alert ssh any any -> 203.0.113.0/24 !22 (sid:1000000;)
      

      This alert would not be that useful, since it does not contain any message about the packet, or a classification type. To add extra information to an alert, as well as match on more specific criteria, Suricata rules have an Options section where you can specify a number of additional settings for a signature.

      Options

      The arguments inside the parentheses (. . .) in a Suricata signature contain various options and keyword modifiers that you can use to match on specific parts of a packet, classify a rule, or log custom messages. Whereas a rule’s header arguments operate on packet headers at the IP, port, and protocol level, options match on the data contained inside a packet.

      Options in a Suricata rule must be separated by a ; semicolon, and generally use a key:value format. Some options do not have any settings and only the name needs to be specified in a rule.
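      The semicolon-separated key:value layout can be illustrated with a small Python sketch. The parse_options function and the sample options string are hypothetical and only show how the pieces divide. The sketch assumes that every option ends with a semicolon, and it treats a semicolon inside double quotes, or one preceded by a backslash, as part of the value.

```python
from typing import List, Optional, Tuple

def parse_options(options: str) -> List[Tuple[str, Optional[str]]]:
    """Split an options string on semicolons that sit outside double quotes."""
    items, current = [], []
    in_quotes = escaped = False
    for ch in options:
        if escaped:                  # the character after a backslash is literal
            current.append(ch)
            escaped = False
        elif ch == "\\":
            current.append(ch)
            escaped = True
        elif ch == '"':
            current.append(ch)
            in_quotes = not in_quotes
        elif ch == ";" and not in_quotes:
            items.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    # Text after the final semicolon is ignored in this sketch.
    parsed = []
    for item in items:
        if not item:
            continue
        key, sep, value = item.partition(":")
        # Options without a setting (no colon) get a value of None.
        parsed.append((key.strip(), value.strip() if sep else None))
    return parsed

sample = 'msg:"Example message"; content:"abc"; nocase; sid:1000002;'
for key, value in parse_options(sample):
    print(key, value)
```

      Options that take no setting, such as nocase in the sample string, come back with a value of None.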

      Using the example signature from the previous section, you could add the msg option with a value of SSH TRAFFIC on non-SSH port explaining what the alert is about:

      Example Header

      alert ssh any any -> 203.0.113.0/24 !22 (msg:"SSH TRAFFIC on non-SSH port"; sid:1000000;)
      

      A full explanation of how you can use each option in a Suricata rule is beyond the scope of this tutorial. The Suricata rules documentation beginning in Section 6.2 describes each keyword option in detail.

      However, there are some core options like the content keyword and various Meta keywords that are used in most signatures, which we’ll examine in the following sections.

      The Content Keyword

      One of the most important options for any rule is the content keyword. Recall the example sid:2100498 signature:

      sid:2100498

      alert ip any any -> any any (msg:"GPL ATTACK_RESPONSE id check returned root"; content:"uid=0|28|root|29|"; classtype:bad-unknown; sid:2100498; rev:7; metadata:created_at 2010_09_23, updated_at 2010_09_23;)
      

      The highlighted content:"uid=0|28|root|29|"; portion contains the content keyword, and the value that Suricata will look for inside a packet. The bytes between the | pipe characters are written in hexadecimal, so |28| and |29| stand for the ( and ) characters, and the pattern matches the literal text uid=0(root). In the case of this example signature, all packets from any IP address on any port will be checked to see whether they contain this value (which in the previous tutorial was used as an example indicating a compromised host).
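      In a content value, bytes enclosed in | pipe characters are given in hexadecimal notation. As a rough illustration of what the pattern in sid:2100498 stands for, the following Python sketch expands a content string into the raw bytes it describes. The content_to_bytes helper is hypothetical, and it assumes a simple pattern made only of ASCII text and well-formed hex sections.

```python
def content_to_bytes(pattern: str) -> bytes:
    """Expand a content pattern with |hex| sections into raw bytes."""
    out = bytearray()
    # Splitting on "|" alternates between literal text (even index)
    # and hexadecimal byte sections (odd index).
    for index, chunk in enumerate(pattern.split("|")):
        if index % 2 == 0:
            out += chunk.encode("ascii")
        else:
            out += bytes.fromhex(chunk)
    return bytes(out)

needle = content_to_bytes("uid=0|28|root|29|")
print(needle)  # b'uid=0(root)'

# A payload that resembles the output of the id command would contain the pattern.
payload = b"uid=0(root) gid=0(root) groups=0(root)"
print(needle in payload)  # True
```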

      The content keyword can be used with most other keywords in Suricata. You can create very specific signatures using combinations of headers, and options that target specific application protocols, and then check packet contents for individual bytes, strings, or matches using regular expressions.

      For example, the following signature examines DNS traffic looking for any packet with the contents your_domain.com and generates an alert:

      dns.query Example

      alert dns any any -> any any (msg:"DNS LOOKUP for your_domain.com"; dns.query; content:"your_domain.com"; sid:1000001;)
      

      However, this rule would not match if the DNS query used the domain YOUR_DOMAIN.COM, since Suricata defaults to case-sensitive content matching. To make content matches insensitive to case, add the nocase; keyword to the rule:

      Case-insensitive dns.query Example

      alert dns any any -> any any (msg:"DNS LOOKUP for your_domain.com"; dns.query; content:"your_domain.com"; nocase; sid:1000001;)
      

      Now any combination of lower or uppercase letters will still match the content keyword.

      The msg Keyword

      The example signatures in this tutorial have all contained msg keywords with information about a signature. While the msg option is not required, leaving it blank makes it difficult to understand why an alert or drop action has occurred when examining Suricata’s logs.

      A msg option is designed to be a human-readable text description of an alert. It should be descriptive and add context to an alert so that you or someone else who is analyzing logs understand why the alert was triggered. In the reference Keyword section of this tutorial you will learn about the reference option that you can use to link to more information about a signature and the issue it is designed to detect.

      The sid and rev Keywords

      Every Suricata signature needs a unique Signature ID (sid). If two rules have the same sid (in the following example output it is sid:10000000), Suricata will not start and will instead generate an error like the following:

      Example Duplicate sid Error

      . . . 19/11/2021 -- 01:17:40 - <Error> - [ERRCODE: SC_ERR_DUPLICATE_SIG(176)] - Duplicate signature "drop ssh any any -> 127.0.0.0/8 !22 (msg:"blocked invalid ssh"; sid:10000000;)" . . .

      When you create your own signatures, the range 1000000-1999999 is reserved for custom rules. Suricata’s built-in rules are in the range from 2200000-2299999. Other sid ranges are documented on the Emerging Threats SID Allocation page.
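      Before loading a custom rules file, it can help to check it for repeated sid values and for values outside the custom range. The following Python sketch is one way to do that. The find_duplicate_sids and in_custom_range helpers and the sample rules are hypothetical, and the regular expression is a simplification that assumes one rule per line.

```python
import re
from collections import Counter
from typing import List

# Simplified pattern: matches "sid:<digits>;" anywhere on a rule line.
SID_RE = re.compile(r"\bsid\s*:\s*(\d+)\s*;")

def find_duplicate_sids(rules_text: str) -> List[int]:
    """Return every sid that appears on more than one rule line."""
    sids = []
    for line in rules_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):   # skip blank lines and comments
            continue
        match = SID_RE.search(line)
        if match:
            sids.append(int(match.group(1)))
    return sorted(sid for sid, count in Counter(sids).items() if count > 1)

def in_custom_range(sid: int) -> bool:
    """True if the sid falls in the 1000000-1999999 range for custom rules."""
    return 1000000 <= sid <= 1999999

sample_rules = """
# hypothetical local rules
alert ssh any any -> 203.0.113.0/24 !22 (msg:"SSH TRAFFIC on non-SSH port"; sid:1000000;)
alert dns any any -> any any (msg:"DNS LOOKUP"; dns.query; content:"example"; sid:1000001;)
alert tcp any any -> 203.0.113.0/24 23 (msg:"Telnet attempt"; sid:1000000;)
"""
print(find_duplicate_sids(sample_rules))  # [1000000]
```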

      The sid option is usually the last part of a Suricata rule. However, if there have been multiple versions of a signature with changes over time, there is a rev option that is used to specify the version of a rule. For example, the SSH alert from earlier in this tutorial could be changed to only scan for SSH traffic on port 2022:

      Example SSH Signature with rev

      alert ssh any any -> 203.0.113.0/24 2022 (msg:"SSH TRAFFIC on non-SSH port"; sid:1000000; rev:2;)
      

      The updated signature now includes the rev:2 option, indicating it has been updated from a previous version.

      The reference Keyword

      The reference keyword is used in signatures to describe where to find more information about the attack or issue that a rule is meant to detect. For example, if a signature is designed to detect a new kind of exploit or attack method, the reference field can be used to link to a security researcher or company’s website that documents the issue.

      The Heartbleed vulnerability in OpenSSL is an example of a widely publicized and researched bug. Suricata comes with a signature that is designed to check for incorrect TLS packets and includes a reference to the main Heartbleed CVE entry:

      /etc/suricata/rules/tls-events.rules

      alert tls any any -> any any (msg:"SURICATA TLS invalid heartbeat encountered, possible exploit attempt (heartbleed)"; flow:established; app-layer-event:tls.invalid_heartbeat_message; flowint:tls.anomaly.count,+,1; classtype:protocol-command-decode; reference:cve,2014-0160; sid:2230013; rev:1;)
      

      Note the highlighted reference:cve,2014-0160; portion of the signature. This reference option tells you or the analyst who is examining alerts from Suricata where to find more information about the particular issue.

      The reference option can use any of the prefixes from the /etc/suricata/reference.config file. For example, url could be used in place of cve in the preceding example, with a link directly to the Heartbleed site in place of the 2014-0160 CVE identifier.

      The classtype Keyword

      Suricata can classify traffic according to a preconfigured set of categories that are included when you install the Suricata package with your Linux distribution’s package manager. The default classification file is usually found in /etc/suricata/classification.config and contains entries like the following:

      /etc/suricata/classification.config

      #
      # config classification:shortname,short description,priority
      #
      
      config classification: not-suspicious,Not Suspicious Traffic,3
      config classification: unknown,Unknown Traffic,3
      config classification: bad-unknown,Potentially Bad Traffic, 2
      . . .
      

      As indicated by the file header, each classification entry has three fields:

      • A short, machine readable name, in the above examples not-suspicious, unknown, and bad-unknown respectively.
      • The description for a classification to be used with alerts, for example Not Suspicious Traffic.
      • A priority field, which determines the order in which a signature will be processed by Suricata. The highest priority is the value 1. Signatures that use a classifier with a higher priority will get checked first when Suricata processes a packet.

      In the example sid:2100498 signature, the classtype is classtype:bad-unknown;, which is highlighted in the following example:

      sid:2100498

      alert ip any any -> any any (msg:"GPL ATTACK_RESPONSE id check returned root"; content:"uid=0|28|root|29|"; classtype:bad-unknown; sid:2100498; rev:7; metadata:created_at 2010_09_23, updated_at 2010_09_23;)
      

      The implicit priority for the signature is 2, since that is the value that is assigned to the bad-unknown classtype in /etc/suricata/classification.config. If you would like to override the default priority for a classtype, you can add a priority:n option to a signature, where n is a value from 1 to 255.
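      The relationship between classtype and priority can be sketched in Python. The load_classifications and effective_priority functions below are hypothetical helpers, not part of Suricata. They assume the config classification: line format shown in the file excerpt, and they take a rule's options as a plain dictionary.

```python
from typing import Dict

def load_classifications(text: str) -> Dict[str, int]:
    """Map each classification shortname to its default priority."""
    priorities = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("config classification:"):
            continue                                  # skip comments and blank lines
        body = line.split(":", 1)[1]
        shortname, rest = body.split(",", 1)          # first field is the shortname
        _description, priority = rest.rsplit(",", 1)  # last field is the priority
        priorities[shortname.strip()] = int(priority)
    return priorities

def effective_priority(options: Dict[str, str], priorities: Dict[str, int]) -> int:
    """An explicit priority option overrides the classtype's default."""
    if "priority" in options:
        return int(options["priority"])
    return priorities[options["classtype"]]

config_text = """
config classification: not-suspicious,Not Suspicious Traffic,3
config classification: unknown,Unknown Traffic,3
config classification: bad-unknown,Potentially Bad Traffic, 2
"""
classes = load_classifications(config_text)
print(effective_priority({"classtype": "bad-unknown"}, classes))                   # 2
print(effective_priority({"classtype": "bad-unknown", "priority": "1"}, classes))  # 1
```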

      The target Keyword

      Another useful option in Suricata signatures is the target option. It can be set to one of two values: src_ip and dest_ip. The purpose of this option is to correctly identify the source and target hosts in Suricata’s alert logs.

      For example, the SSH signature from earlier in this tutorial can be enhanced with the target:dest_ip; option:

      Example SSH Signature with target field

      alert ssh any any -> 203.0.113.0/24 2022 (msg:"SSH TRAFFIC on non-SSH port"; target:dest_ip; sid:1000000; rev:3;)
      

      This example uses dest_ip because the rule is designed to check for SSH traffic coming into our example network, so it is the destination. Adding the target option to a rule will result in the following extra fields in the alert portion of an eve.json log entry:

      . . .
        "source": {
          "ip": "127.0.0.1",
          "port": 35272
        },
        "target": {
          "ip": "203.0.113.1",
          "port": 2022
        }
      . . .
      

      Once these entries are in Suricata’s logs, you can send the logs to a Security Information and Event Management (SIEM) tool, which makes it easier to search for alerts that might be originating from a common host, or attacks that are directed to a specific target on your network.
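      Even without a SIEM, the target field can be used for a quick summary. The following Python sketch counts alerts per target IP address from eve.json style lines. The count_alert_targets function and the sample events are hypothetical; the sketch assumes one JSON object per line, with the target data nested inside the alert object as in the excerpt above.

```python
import json
from collections import Counter
from typing import Iterable

def count_alert_targets(lines: Iterable[str]) -> Counter:
    """Count alert events per target IP address."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("event_type") != "alert":
            continue                        # ignore non-alert events
        target = event.get("alert", {}).get("target")
        if target:                          # only rules with a target option add this
            counts[target["ip"]] += 1
    return counts

# Hypothetical, heavily trimmed log lines for illustration.
sample_events = [
    '{"event_type": "alert", "alert": {"signature_id": 1000000, "target": {"ip": "203.0.113.1", "port": 2022}}}',
    '{"event_type": "alert", "alert": {"signature_id": 1000000, "target": {"ip": "203.0.113.1", "port": 2022}}}',
    '{"event_type": "dns"}',
]
print(count_alert_targets(sample_events).most_common())  # [('203.0.113.1', 2)]
```

      To run the same function over a real log, pass it an open file handle for your eve.json file, since iterating over a file yields one line at a time.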

      Conclusion

      In this tutorial you examined each of the main sections that make up a complete Suricata signature. Each of the Actions, Header, and Options sections in a rule has multiple options and supports scanning packets using many different protocols. While this tutorial did not explore any of the sections in great depth, the structure of a rule and the important fields in the examples should be enough to get started writing your own rules.

      If you want to explore complete signatures that include many more options than the ones described in this tutorial, explore the files in the /etc/suricata/rules directory. If there is a field in a rule that you would like to know more about, the Suricata Rules Documentation is the authoritative resource on what each option and its possible values mean.

      Once you are comfortable reading and testing signatures, you can proceed to the next tutorial in this series. In it you will learn how to enable Suricata’s IPS mode, which is used to drop suspicious traffic as opposed to the default IDS mode that only generates alerts.




      Understanding Open-Source Software Licenses


      Introduction

      A software license is a legal agreement that defines how a given piece of software can be used. For software developers who may want to exercise certain rights, permissions, and control over how the work is used, modified, and shared by others, choosing a software license is an important decision. Some developers may want to place strong restrictions over how their software can be used. Others, however, may choose to license their software with few or no restrictions. This may be because they want their software to be as widely used as possible, or perhaps they oppose restrictive software licenses on philosophical grounds.

      Regardless of their reasoning, developers can accomplish this by implementing an open-source software license. Broadly speaking, open-source software licenses make the source code available for use, modification, and distribution based on agreed-upon terms and conditions. There are many different open-source software licenses, and they vary based on the restrictions a creator may want future users to abide by.

      When it comes to long-term planning for your project, it’s useful to understand the open-source software licenses available so that you can make an informed decision about which one best suits your project’s needs. In this article, we will share information about rights you have when your work is created (such as copyright), and how licensing helps establish the legal agreement you want your users to abide by when using your software. We will also discuss the differences between proprietary, free, and open-source software, permissive and copyleft licenses, and information about the open-source software license options suggested when creating a GitHub project.

      Note: This article is not intended to provide any form of legal advice; it’s solely a resource of information on the topic of open-source software licensing.

      If you’d like to learn more about patents, trademarks, and intellectual property, you can visit the U.S. Patent and Trademark Office.

      In the U.S. and many countries, there are certain legal protections you are automatically granted for any creative work you produce, one of those being copyright. The U.S. Copyright Office defines copyright as “a type of intellectual property that protects original works of authorship,” specifically when the “author fixes the work in a tangible form of expression.” This means with copyright you are not the owner of the idea, but rather the material expression of the idea. If a copyright owner desires stricter legal protection over their work, this can be achieved through patents, trademarks, and intellectual property laws. Copyrighting your work does not require a formal process to ensure these rights are given.

      Copyright grants the owner various rights, such as reproducing and distributing copies of the work. If an owner wants control over how their work can be used by others, then they must implement a license that outlines the rules by which those users must abide. If the copyright owner states the work is “All Rights Reserved”, this means that their work cannot be used or modified by anyone at all, except themselves.

      Another complexity to acknowledge is the creative work you produce for your employer. If you’re engaging in what is known as work for hire, this means that any work you create for the company or organization you work for belongs to that entity, since they’re paying you for the work. As a result, sharing this work without permission has legal consequences since you do not have ownership rights to copyright or licensing.

      Proprietary Software, Free Software, and Open-Source Software

      Proprietary software is any software with a license that restricts how it can be used, modified, or shared. Video games are a common example of proprietary software. If you purchase a video game (whether as a cartridge, disc, or digital download), you aren’t allowed to make a copy of that game to share with friends or sell for profit. It’s also likely you aren’t permitted to modify the game’s code to run it on a different platform than the one you originally bought it for.

      Software users are typically held to certain restrictions with an end-user license agreement (EULA). If you’ve ever purchased software, you may have assumed you own that piece of software. However, if you’ve purchased proprietary software, it will likely come with a EULA that specifies you do not own the software. Instead, you’re the owner of a software license that permits you to use that software. EULAs may also define how you can use the license itself, and they typically limit you from sharing it with others without the permission of the software owner (the software’s developer or publisher).

      Another legal instrument similar to a EULA is a Terms of Service agreement (ToS). Sometimes known as Terms of Use or Terms and Conditions, a ToS outlines the rules a user must follow in order to be allowed to use a program or service. It’s more common to see a EULA included with software that requires a one-time purchase, while ToS agreements are more common for subscription services and websites. Oftentimes, the first time you start a given piece of proprietary software, a dialog box will appear which explains the EULA or ToS and contains an I Agree button (or something similar) which you must click before you can use the program.

      Software with such restrictions hasn’t always been the norm. Before the 1970s, software was typically distributed along with its source code, meaning users were free to modify and share the software as they desired. With time, though, software publishers began imposing restrictions on these activities, typically with the goal of increasing profits by reducing the number of people who used their software but didn’t pay for it.

      This development had repercussions in the form of two closely related movements: the free software and the open-source software movements. Although the two are distinct, the free software and open-source software movements both argue that software users should be allowed to access a program’s source code, modify it as they see fit, and share it as often and with whomever they like.

      Note: Since free software is generally considered to be open source, but open-source software is not always considered to be free, this guide will default to the more inclusive terms “open-source software” and “open-source software licenses” moving forward. However, please be aware that the two terms are not always interchangeable.

      If you’d like a more thorough explanation of the history and differences between free software and open-source software, we encourage you to read our article on The Difference Between Free and Open-Source Software.

      Open-source software advocates still encourage developers to distribute their software with a license. However, instead of a proprietary software license outlining what users may not do, they recommend using an open-source software license that outlines the freedoms available to users of the given piece of software. These licenses are often distributed as a single file within the program, typically named LICENSE.txt or a similar naming convention.

      Over the years, there has been some disagreement about what specific freedoms should be guaranteed by an open-source software license. This has led to the emergence of many different open-source licenses, but most of these can fall into one of two categories: permissive and copyleft licenses.

      Permissive and Copyleft Open-Source Software Licenses

      A permissive license, sometimes referred to as a non-copyleft license, grants users permission to use, modify, and share the source code, but users also have the option to change some of those terms and conditions for redistribution, including derivative work. In the context of software, a derivative work is a piece of software that is based on an existing program. If the original was released under a permissive license, a creator can choose to share their derivative work with different terms than what the original work’s license might have required.

      A copyleft license also grants users permission to use, modify, and share the source code, but offers protection against relicensing through specific restrictions and terms and conditions. This means that software users creating derivative work are required to release it under the same copyleft license terms and conditions as the original work. This reciprocity is a defining aspect of copyleft licenses, and is intended to protect creators’ intentions by ensuring that users will have the same rights and permissions when using works derived from the original software.

      In addition, there are public-domain-equivalent licenses that grant users permission to use copyrighted works without attribution or required licensing compatibility. For a creator, this means that any rights over their work are completely forfeited. Although there is some overlap in the philosophies behind public-domain and free and open-source software licenses, there has been disagreement over the years about whether a public-domain-equivalent license truly qualifies as open source. In 2012, the CC0 license was submitted for approval to the Open Source Initiative (OSI), a nonprofit organization that defines standards for open-source software and maintains a list of approved open source licenses, but it was never approved. However, the OSI did approve a public-domain-equivalent license called the Unlicense in 2020.

      Why Include an Open-Source Software License?

      As a developer starting a project from scratch, it’s important to have some familiarity with the open-source software licenses available to assess how you’d like others to use your work. Recognizing these licenses is also important to users so they can understand the permissions or restrictions set by the agreement they’ve made when using the creator’s work.

      Again, any original work will have copyright upon completion, but without a license, it’s unclear what is and isn’t allowed for those who want to use it. Consider the following reasons why you might include an open-source software license:

      • Improvement: The open-source community prides itself on cultivating a culture that encourages collaboration and innovation. Using an open-source software license invites users to engage in community development. This creates a shared sense of responsibility to consistently improve the source code or expand the program further to everyone’s benefit.

      • Ownership: If you want to exercise more power over your work, choosing a license that can place those restrictions will help you do so. For instance, if you want any derivative works to grant the same permissions as the one you originally chose, you may want to opt for a copyleft license. Fortunately, an open-source software license provides transparency to future users about how much control you retain over the work. Whether that is a lot or a little is up to you.

      • Competition: There’s a plethora of software out there and if you want to break into that market, using an open-source license can help put you on the map. Some popular examples of open-source software that were developed to compete with established proprietary alternatives include the Linux operating system, Android by Google, and the Firefox browser.

      Keep in mind that it is possible to monetize an open-source software project, but the typical business practice for monetizing software is to use a proprietary license to protect the software from being shared or stolen.

      These reasons for using an open-source software license may not all be applicable to you, and we encourage you to do your own research on the subject before choosing a license for your next project. Additionally, you may want to seek the assistance of a legal professional to confirm a full understanding of what a license would signify for your work in the present and future.

      As mentioned earlier, this article focuses on the open-source software licenses listed when creating a new repository for your project on GitHub. You’ll notice at the end of the page there is an option for choosing a license. Once you click the box, a drop-down list of licenses will appear for you to select from, like in the following:

      List of open-source software licenses on GitHub

      In the next sections, we will provide brief descriptions of the types of open-source software licenses you can choose from for your next project, starting with the permissive licenses recommended by GitHub.

      Permissive Open-Source Software Licenses

      Permissive licenses grant software users permission to use, modify, and share the source code. Additionally, creators of software derived from permissively licensed software can change the licensing conditions for redistribution.

      Please note, the following list is not representative of all the permissive open-source software licenses available. Rather, this list is taken from the license options offered by GitHub when starting a new project. Also, these brief descriptions are not comprehensive. We recommend carefully reading through the documentation for any license you’re interested in using or speaking with a legal professional for more information.

      Apache License

      The Apache License is written by the Apache Software Foundation (ASF). With this license, users do not have to share their modified version of the source code under the same license and can choose to use a different one; this is known as sublicensing.

      MIT License

      The MIT License is from the Massachusetts Institute of Technology (MIT) and is one of the shortest to read with few restrictions. Similar to the Apache license, it also gives users the option to sublicense the software.

      BSD Licenses

      GitHub lets you choose between two BSD licenses: the BSD 2-Clause “Simplified” License, sometimes referred to as the “FreeBSD” license, and the BSD 3-Clause “New” or “Revised” License. The main difference between the two is the third clause, which restricts software users from using the name of the author, authors, or contributors to endorse products or services.

      Boost Software License

      The Boost Software License is from the Boost C++ Libraries and was approved by the OSI in 2008. This license is similar to the MIT and BSD licenses, except it does not require attribution when redistributing in binary form.

      Copyleft Open-Source Software Licenses

      Copyleft licenses grant software users permission to use, modify, and share the source code, but also protect against relicensing through specific restrictions and terms and conditions. This represents the reciprocal characteristic of this license that requires users’ work to adhere to the original rights outlined in the license.

      Again, the following list is not representative of all the copyleft open-source software licenses available. Rather, this list is taken from the license options offered by GitHub when starting a new project. Also, these brief descriptions are not comprehensive. We recommend carefully reading through the documentation for any license you’re interested in using or speaking with a legal professional for more information.

      GNU Licenses

      The Free Software Foundation has released a number of versions of the GNU General Public License (GPL), four of which users can choose from on GitHub. The GPL v3.0 requires users to state any modifications to the original code and to make the source code available when distributing binaries of software under that license. This license also made it easier to work with other licenses such as Apache, with which the previous version (v2.0) was not compatible.

      Before the current GPL v3.0, there was a second version, the GNU General Public License v2.0. This license shares similar terms and conditions with v3.0 and is considered a strong copyleft license. A strong copyleft license requires that any modifications to the source code get released using the same license. The primary difference with v2.0 is that software users are allowed to distribute work if they adhere to the requirements of the license, regardless of prior legal obligations. The goal of this clause is to prevent an individual or party from submitting a patent infringement claim that would limit a user’s freedom under this license.

      There is also the GNU Lesser General Public License v2.1, referred to as LGPL. This license is meant to serve as a middle ground between strong and weak copyleft licenses. The main difference with this license is that software users can combine an LGPL-licensed software component with their own and are not required to share the source code of their own components. Users can also distribute a hybrid library, which is a combination of functions in the LGPL library and functions from a non-LGPL library, but there must be a copy of that non-LGPL library and information on where it’s located.

      Another GNU license is the GNU Affero General Public License v3.0, referred to as AGPL. The main difference with this license is that it is specific to software programs used on a server. This license requires users who run a modified program on a server to share this information and make the modified source code available for download to the relevant modified version that is currently running on the server.

      Eclipse Public License

      The Eclipse Public License is from the Eclipse Foundation and is considered a weak copyleft license. A weak copyleft license requires software users to share any changes they make to the code. This license chose to implement a weaker copyleft as a way to reduce the stricter requirements users encountered with GNU’s General Public Licenses.

      Mozilla Public License

      The Mozilla Public License, or MPL, is from the Mozilla Foundation and is also considered a weak copyleft license. The difference with this license (in comparison with the Eclipse Public License) is that it is file-based copyleft, which means code can be combined with open-source or proprietary code.

      Public-Domain-Equivalent Licenses

      Public-domain-equivalent licenses grant users permission to use copyrighted works without attribution or required licensing compatibility. As you may recall, these licenses are not always OSI-approved.

      Creative Commons Zero Universal License

      The Creative Commons Zero Universal License was written by Creative Commons and is considered a public copyright license. This means copyrighted work can be freely distributed. Please be aware that this license is not OSI-approved. The main point about this license is that users can use, distribute, and modify the source code, but must agree to waive any copyrights to ensure this work is accessible in the public domain. Additionally, users do not have to provide any attribution to the work and can use it commercially.

      The Unlicense

      The Unlicense was released in 2012 and is considered a public-domain-equivalent license that is OSI-approved. With this license, software users can use, modify, and distribute the source code and compiled binaries for both commercial and non-commercial purposes. This license also advises users who want to ensure that contributions to the code or software are available to the public domain to include a statement about their commitment to sharing the code base with the public.

      Conclusion

      There are many factors to consider when choosing an open-source software license. Yet, there are certainly popular choices among the developer community. Common permissive licenses include the MIT License, Apache License, and BSD License. Some common copyleft licenses include the GNU General Public License and the Mozilla Public License.

      Remember, this article only provided information about a few common open-source software licenses, specifically the ones suggested by GitHub. We encourage you to explore all of your available licensing options or consult the help of a legal professional to make an informed decision about what best fits the needs of your project.




      Understanding Data Types in PHP


      The author selected Open Sourcing Mental Illness Ltd to receive a donation as part of the Write for DOnations program.

      Introduction

      In PHP, as in all programming languages, data types are used to classify one particular type of data. This is important because the specific data type you use will determine what values you can assign to it and what you can do to it (including what operations you can perform on it).

      In this tutorial, we will go over the important data types native to PHP. This is not an exhaustive investigation of data types, but will help you become familiar with what options you have available to you in PHP.

      One way to think about data types is to consider the different types of data that we use in the real world. Two different types are numbers and words. These two data types work in different ways. We would add 3 + 4 to get 7, while we would combine the words star and fish to get starfish.

      If we start evaluating different data types with one another, such as numbers and words, things start to make less sense. The following equation, for example, has no obvious answer:

      'sky' + 8

      For computers, each data type can be thought of as being quite different, like words and numbers, so we have to be careful about how we use them to assign values and how we manipulate them through operations.

      Working with Data Types

      PHP is a loosely typed language. This means, by default, if a value doesn’t match the expected data type, PHP will attempt to change the value of the wrong data type to match the expected type when possible. This is called type juggling. For example, a function that expects a string but instead receives an integer with a value of 2 will change the incoming value into the expected string type with a value of "2".
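
      As a minimal sketch of the type juggling described above, the following uses a hypothetical describe() function (not part of PHP) that declares a string parameter. It assumes the file does not enable strict mode:

```php
<?php
// Hypothetical helper for illustration: it declares a string parameter.
function describe(string $value): string
{
    // Report the type and value the function actually received.
    return gettype($value) . ':' . $value;
}

// This file has no strict_types declaration, so PHP converts the int 2 to "2".
$result = describe(2);
echo $result, PHP_EOL; // string:2
```

      The integer is converted before the function body runs, so the function only ever sees a string.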

      It is possible, and encouraged, to enable strict mode on a per-file basis. This provides enforcement of data types in the code you control, while allowing the use of additional code packages that may not adhere to strict data types. Strict type is declared at the top of a file:

      <?php
      declare(strict_types=1);
      ...
      

      In strict mode, only a value corresponding exactly to the type declaration will be accepted; otherwise a TypeError will be thrown. The only exception to this rule is that an int value will pass a float type declaration.
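
      The following sketch shows both behaviors in one file: an int passing a float declaration, and an int being rejected by a string declaration. The half() and shout() functions are hypothetical names chosen for illustration:

```php
<?php
declare(strict_types=1);

// Hypothetical functions for illustration.
function half(float $number): float
{
    return $number / 2;
}

function shout(string $text): string
{
    return strtoupper($text) . '!';
}

// An int is accepted where a float is declared (the one exception).
$halved = half(10);

// An int is rejected where a string is declared.
try {
    shout(2);
    $outcome = 'no error';
} catch (TypeError $e) {
    $outcome = 'TypeError';
}

var_dump($halved);      // float(5)
echo $outcome, PHP_EOL; // TypeError
```

      Note that the declare statement has to be the first statement in the file, which is why it appears directly after the opening tag.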

      Numbers

      Any number you enter in PHP will be interpreted as a number. You are not required to declare what kind of data type you are entering. PHP will consider any number written without decimals as an integer (such as 138) and any number written with decimals as a float (such as 138.0).
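
      You can check how PHP classified a number with the built-in is_int, is_float, and gettype functions. This short sketch uses the two example values from the paragraph above; the variable names are arbitrary:

```php
<?php
$whole = 138;
$decimal = 138.0;

var_dump(is_int($whole));     // bool(true)
var_dump(is_float($decimal)); // bool(true)

// gettype() reports floats as "double" for historical reasons.
echo gettype($whole), PHP_EOL;   // integer
echo gettype($decimal), PHP_EOL; // double
```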

      Integers

      Like in math, integers in computer programming are whole numbers that can be positive, negative, or 0 (…, -1, 0, 1, …). An integer can also be known as an int. As with other programming languages, you should not use commas in numbers of four digits or more, so to represent the number 1,000 in your program, write it as 1000.

      We can print out an integer like this:

      echo -25;
      

      Which would output:

      Output

      -25

      We can also declare a variable, which in this case is a symbol of the number we are using or manipulating, like so:

      $my_int = -25;
      echo $my_int;
      

      Which would output:

      Output

      -25

      We can do math with integers in PHP, too:

      $int_ans = 116 - 68;
      echo $int_ans;
      

      Which would output:

      Output

      48

      Integers can be used in many ways within PHP programs, and as you continue to learn more about the language you will have a lot of opportunities to work with integers and understand more about this data type.
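
      As a brief illustration of integer arithmetic, the sketch below uses a few arbitrary values. One detail worth noticing is that dividing two integers with / returns a float when the result is not a whole number, while the built-in intdiv function always returns an integer:

```php
<?php
$sum = 116 + 68;                 // int(184)
$product = 12 * 4;               // int(48)
$quotient = 7 / 2;               // float(3.5)
$whole_quotient = intdiv(7, 2);  // int(3)
$remainder = 7 % 2;              // int(1)

var_dump($sum, $product, $quotient, $whole_quotient, $remainder);
```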

      Floating-Point Numbers

      A floating-point number or float is a real number, meaning that it can be either a rational or an irrational number. Because of this, floating-point numbers can be numbers that can contain a fractional part, such as 9.0 or -116.42. For the purposes of thinking of a float in a PHP program, it is a number that contains a decimal point.

      Like we did with the integer, we can print out a floating-point number like this:

      echo 17.3;
      

      Which would output:

      Output

      17.3

      We can also declare a variable that stands in for a float, like so:

      $my_flt = 17.3;
      echo $my_flt;
      

      Which would output:

      Output

      17.3

      And, just like with integers, we can do math with floats in PHP, too:

      $flt_ans = 564.0 + 365.24;
      echo $flt_ans;
      

      Which would output:

      Output

      929.24

      With integers and floating-point numbers, it is important to keep in mind that 3 and 3.0 are not the same type: 3 is an integer while 3.0 is a float. A loose comparison (==) treats them as equal, but a strict comparison (===) does not. This may or may not change the way your program functions.
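
      To see how the integer 3 and the float 3.0 differ in practice, this sketch compares them with both of PHP’s equality operators. The difference is one of type rather than numeric value; the variable names are arbitrary:

```php
<?php
$int_three = 3;
$float_three = 3.0;

var_dump($int_three == $float_three);  // bool(true): values are compared after conversion
var_dump($int_three === $float_three); // bool(false): the types differ

// Mixing an int and a float in arithmetic produces a float.
$mixed = $int_three + $float_three;
var_dump($mixed); // float(6)
```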

      Numbers are useful when working with calculations, counting items or money, and the passage of time.

      Strings

      A string is a sequence of one or more characters that may consist of letters, numbers, or symbols. This sequence is enclosed within either single quotes '' or double quotes "":

      echo 'This is a 47 character string in single quotes.';
      echo "This is a 47 character string in double quotes.";
      

      Both lines output their value the same way:

      Output

      This is a 47 character string in single quotes. This is a 47 character string in double quotes.

      You can choose to use either single quotes or double quotes, but whichever you decide on you should be consistent within a program.
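
      The two quote styles are interchangeable for plain text like the examples above, but they do differ once a string contains variables or escape sequences: double quotes interpret them, single quotes do not. A small sketch, using an arbitrary $animal variable:

```php
<?php
$animal = 'shark';

$single = 'Sammy is a $animal.\n'; // taken literally
$double = "Sammy is a $animal.\n"; // $animal and \n are interpreted

echo $single, PHP_EOL; // Sammy is a $animal.\n
echo $double;          // Sammy is a shark.

// strlen() confirms the length claimed by the earlier example string.
echo strlen('This is a 47 character string in single quotes.'), PHP_EOL; // 47
```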

      The program “Hello, World!” demonstrates how a string can be used in computer programming, as the characters that make up the phrase Hello, World! are a string:

      echo "Hello, World!";
      

      As with other data types, we can store strings in variables and output the results:

      $hw = "Hello, World!";
      echo $hw;
      

      Either way, the output is the same:

      Output

      Hello, World!

      Like numbers, there are many operations that we can perform on strings within our programs in order to manipulate them to achieve the results we are seeking. Strings are important for communicating information to the user, and for the user to communicate information back to the program.
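
      As a short sketch of a few such operations, the following combines the words star and fish from the introduction with the . concatenation operator, then applies some of PHP’s built-in string functions:

```php
<?php
$first = 'star';
$second = 'fish';

$combined = $first . $second;                        // concatenation
$length = strlen($combined);                         // number of bytes in the string
$upper = strtoupper($combined);                      // uppercase copy
$replaced = str_replace('star', 'jelly', $combined); // replace part of the string

echo $combined, PHP_EOL; // starfish
echo $length, PHP_EOL;   // 8
echo $upper, PHP_EOL;    // STARFISH
echo $replaced, PHP_EOL; // jellyfish
```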

      Booleans

      The Boolean, or bool, data type can be one of two values, either true or false. Booleans are used to represent the truth values that are associated with the logic branch of mathematics.

      You do not use quotes when declaring a Boolean value; anything in quotes is assumed to be a string. PHP doesn’t care about case when declaring a Boolean; True, TRUE, true, and tRuE all evaluate the same. If you follow the style guide put out by the PHP-FIG, the values should be all lowercase true or false.

      Many operations in math give us answers that evaluate to either True or False:

      • greater than
        • 500 > 100 True
        • 1 > 5 False
      • less than
        • 200 < 400 True
        • 4 < 2 False
      • equal
        • 5 = 5 True
        • 500 = 400 False
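
      The same comparisons can be written in PHP, as sketched below. One difference from the math notation is that equality is written with ==, because a single = assigns a value. The $checks array is only a convenience for printing each result:

```php
<?php
// Each comparison evaluates to a Boolean.
$checks = [
    '500 > 100'  => 500 > 100,
    '1 > 5'      => 1 > 5,
    '200 < 400'  => 200 < 400,
    '4 < 2'      => 4 < 2,
    '5 == 5'     => 5 == 5,
    '500 == 400' => 500 == 400,
];

foreach ($checks as $expression => $result) {
    // var_export() with true as the second argument returns "true" or "false".
    echo $expression, ' is ', var_export($result, true), PHP_EOL;
}
```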

      Like with any other data type, we can store a Boolean value in a variable. Unlike with numbers or strings, echo is not a reliable way to output the value, because a Boolean true value is converted to the string "1", while a Boolean false is converted to "" (an empty string). This allows “type juggling” to convert a variable back and forth between Boolean and string values. To output the value of a Boolean we have several options. To output the type along with the value of a variable, we use var_dump. To output the string representation of a variable’s value, we use var_export:

      $my_bool = 4 > 3;
      echo $my_bool;
      var_dump($my_bool);
      var_export($my_bool);
      

      Since 4 is greater than 3, we will receive the following output:

      Output

      1 bool(true) true

      The echo line converts the true Boolean to the string of 1. The var_dump outputs the variable type of bool along with the value of true. The var_export outputs the string representation of the value which is true.

      As you write more programs in PHP, you will become more familiar with how Booleans work and how different functions and operations evaluating to either True or False can change the course of the program.
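
      As a minimal sketch of a Boolean changing the course of a program, the following stores the result of a comparison and branches on it. The $depth value and the messages are made up for illustration:

```php
<?php
$depth = 30;            // hypothetical depth in meters
$is_deep = $depth > 20; // the comparison evaluates to true

if ($is_deep) {
    $message = 'Sammy is in deep water.';
} else {
    $message = 'Sammy is near the surface.';
}

echo $message, PHP_EOL; // Sammy is in deep water.
```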

      Arrays

      An array in PHP is actually an ordered map. A map is a data type that associates or “maps” values to keys. This data type has many different uses; it can be treated as an array, list, hash table, dictionary, collection, and more. Additionally, because array values in PHP can also be other arrays, multidimensional arrays are possible.
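
      To illustrate the last point, here is a small sketch of a multidimensional array, written with the square-bracket and => syntax that the following sections explain. The $ocean variable and its contents are invented for the example:

```php
<?php
// Each value in the outer array is itself an array.
$ocean = [
    'sharks'      => ['hammerhead', 'great white'],
    'cephalopods' => ['cuttlefish', 'squid'],
];

// Chain square brackets to reach a value in an inner array.
echo $ocean['sharks'][1], PHP_EOL; // great white
echo count($ocean), PHP_EOL;       // 2
```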

      Indexed Arrays

      In its simplest form, an array will have a numeric index or key. If you do not specify a key, PHP will automatically generate the next numeric key for you. By default, array keys are 0-indexed, which means that the first key is 0, not 1. Each element, or value, that is inside of an array can also be referred to as an item.

      An array can be defined in one of two ways. The first is using the array() language construct, which uses a comma-separated list of items. An array of integers would be defined like this:

      array(-3, -2, -1, 0, 1, 2, 3)
      

      The second and more common way to define an array is through the short array syntax using square brackets []. An array of floats would be defined like this:

      [3.14, 9.23, 111.11, 312.12, 1.05]
      

      We can also define an array of strings, and assign an array to a variable, like so:

      $sea_creatures = ['shark', 'cuttlefish', 'squid', 'mantis shrimp'];
      

      Once again, we cannot use echo to output an entire array, but we can use var_export or var_dump:

      var_export($sea_creatures);
      var_dump($sea_creatures);
      

      The output shows that the array uses numeric keys:

      Output

      array (
        0 => 'shark',
        1 => 'cuttlefish',
        2 => 'squid',
        3 => 'mantis shrimp',
      )
      array(4) {
        [0]=>
        string(5) "shark"
        [1]=>
        string(10) "cuttlefish"
        [2]=>
        string(5) "squid"
        [3]=>
        string(13) "mantis shrimp"
      }

      Because the array is 0-indexed, the var_dump shows an indexed array with numeric keys between 0 and 3. Each numeric key corresponds with a string value. The first element has a key of 0 and a value of shark. The var_dump function gives us more details about an array: there are 4 items in the array, and the value of the first item is a string with a length of 5.

      The numeric key of an indexed array may be specified when setting the value. However, the key is more commonly specified when using a named key.
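
      The sketch below shows both automatic and explicit numeric keys on a shortened version of the $sea_creatures array. The added 'anemone' item is invented for the example:

```php
<?php
$sea_creatures = ['shark', 'cuttlefish']; // keys 0 and 1 are generated

$sea_creatures[] = 'squid';          // PHP assigns the next numeric key, 2
$sea_creatures[5] = 'mantis shrimp'; // the key is specified explicitly
$sea_creatures[] = 'anemone';        // continues from the largest key, so 6

echo $sea_creatures[0], PHP_EOL;                        // shark
echo implode(', ', array_keys($sea_creatures)), PHP_EOL; // 0, 1, 2, 5, 6
```

      Once a key has been set explicitly, PHP continues numbering from the largest integer key used so far, which is why the keys 3 and 4 are skipped.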

      Associative Arrays

      Associative arrays are arrays with named keys. They are typically used to hold data that are related, such as the information contained in an ID. An associative array looks like this:

      ['name' => 'Sammy', 'animal' => 'shark', 'color' => 'blue', 'location' => 'ocean']
      

      Notice the double arrow operator => used to separate the strings. The words to the left of the => are the keys. The key can either be an integer or a string. The keys in the previous array are: 'name', 'animal', 'color', 'location'.

      The words to the right of the => are the values. Values can be of any data type, including another array. The values in the previous array are: 'Sammy', 'shark', 'blue', 'ocean'.

      Like the indexed array, let’s store the associative array inside a variable, and output the details:

      $sammy = ['name' => 'Sammy', 'animal' => 'shark', 'color' => 'blue', 'location' => 'ocean'];
      var_dump($sammy);
      

      The results will describe this array as having 4 elements. The string for each key is given, but only the value specifies the type string with a character count:

      Output

      array(4) {
        ["name"]=>
        string(5) "Sammy"
        ["animal"]=>
        string(5) "shark"
        ["color"]=>
        string(4) "blue"
        ["location"]=>
        string(5) "ocean"
      }

      Associative arrays allow us to more precisely access a single element. If we want to isolate Sammy’s color, we can do so by adding square brackets containing the name of the key after the array variable:

      echo $sammy['color'];
      

      The resulting output:

      Output

      blue

      As arrays offer key-value mapping for storing data, they can be important elements in your PHP program.
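
      As a further sketch of working with key-value pairs, the following changes a value, adds a new key, and loops over the $sammy array. The 'age' key and the new color are made up for illustration:

```php
<?php
$sammy = ['name' => 'Sammy', 'animal' => 'shark', 'color' => 'blue', 'location' => 'ocean'];

$sammy['color'] = 'gray'; // replace the value stored under an existing key
$sammy['age'] = 3;        // add a new key (hypothetical)

// array_key_exists() checks for a key without reading its value.
$has_age = array_key_exists('age', $sammy);
var_dump($has_age); // bool(true)

// foreach gives access to each key and its value in insertion order.
foreach ($sammy as $key => $value) {
    echo $key, ': ', $value, PHP_EOL;
}
```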

      Constants

      While a constant is not actually a separate data type, it does work differently than a variable. As the name implies, constants are named values that are declared once, after which they do not change throughout your application. By convention, the name of a constant is written in uppercase, and it does not start with a dollar sign. A constant can be declared using either the define function or the const keyword:

      define('MIN_VALUE', 1);
      const MAX_VALUE = 10;
      

      The define function takes two parameters: the first is a string containing the name of the constant, and the second is the value to assign. This could be any of the data type values explained earlier. The const keyword allows the constant to be assigned a value in the same manner as other data types, using the single equal sign. A constant can be used within your application in the same way as other variables, except they will not be interpreted within a double quoted string:

      echo "The value must be between MIN_VALUE and MAX_VALUE";
      echo "The value must be between ".MIN_VALUE." and ".MAX_VALUE;
      

      Because the constants are not interpreted, the output of these lines is different:

      Output

      The value must be between MIN_VALUE and MAX_VALUE The value must be between 1 and 10
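
      Besides concatenation, one alternative for placing constants in a string is the built-in sprintf function, sketched below with the same two constants. The defined and constant functions are also shown as ways to check for and look up a constant by name:

```php
<?php
define('MIN_VALUE', 1);
const MAX_VALUE = 10;

// %d is replaced by each integer argument in order.
$message = sprintf('The value must be between %d and %d', MIN_VALUE, MAX_VALUE);
echo $message, PHP_EOL; // The value must be between 1 and 10

var_dump(defined('MIN_VALUE'));      // bool(true)
echo constant('MAX_VALUE'), PHP_EOL; // 10
```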

      Conclusion

      At this point, you should have a better understanding of some of the major data types that are available for you to use in PHP. Each of these data types will become important as you develop programming projects in the PHP language.


