Access Log Exports to S3

Prerequisites

To use this feature, you'll need administrative access to an AWS account (in order to create an S3 bucket for receiving the log exports from Cloudsmith).

Setup

First of all, please familiarise yourself with the changelog post for the feature.

Next, you'll need to follow these steps:

1. Create an S3 bucket for the logs, such as cloudsmith-acmecorp-logs where you can replace acmecorp with your own organization name.

2. If using an existing bucket, pick a prefix for the Cloudsmith logs to go into, such as cloudsmith-logs. The prefix goes into the IAM policy you create below, and we configure the same prefix on our side.

3. Contact us with your AWS account ID, the name of the S3 bucket you've created, the name of the IAM Role you're going to create in the next step (such as CloudsmithLogsWriter), and the log format that you want to export. We'll then give you the External ID value to use when creating the role; it forms part of the authentication we use to assume your IAM Role. Using an External ID is a best practice recommended by AWS.

4. Create a new IAM Role for "Another AWS account", using the role name you chose in the previous step. Specify 884446598447 as the Account ID, tick "Require external ID", and then enter the External ID that we provided to you.

[Figure: Creating a Role for writing Cloudsmith logs via IAM in AWS.]

The IAM Role should have the following permissions policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:PutObject",
        "s3:PutObjectAcl"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::cloudsmith-acmecorp-logs/cloudsmith-logs/*"
      ]
    }
  ]
}

📘

Note: Bucket Name and Prefix

Make sure to replace cloudsmith-acmecorp-logs with your own bucket name, and replace cloudsmith-logs with your own prefix. If you don't need a prefix, delete the cloudsmith-logs/ string.
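As a rough illustration of the substitution described in this note, the Python sketch below builds the same permissions policy from a bucket name and an optional prefix. The function name build_permissions_policy is hypothetical and not part of any Cloudsmith or AWS tooling; it only assembles the JSON document shown above.

```python
import json


def build_permissions_policy(bucket: str, prefix: str = "") -> dict:
    """Return an IAM permissions policy allowing writes under bucket/prefix."""
    # Strip stray slashes so "logs/" and "/logs" both behave like "logs".
    prefix = prefix.strip("/")
    # With no prefix, the resource covers every key in the bucket.
    key_pattern = f"{prefix}/*" if prefix else "*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Action": ["s3:PutObject", "s3:PutObjectAcl"],
                "Effect": "Allow",
                "Resource": [f"arn:aws:s3:::{bucket}/{key_pattern}"],
            }
        ],
    }


if __name__ == "__main__":
    # Example values taken from this guide; replace them with your own.
    policy = build_permissions_policy("cloudsmith-acmecorp-logs", "cloudsmith-logs")
    print(json.dumps(policy, indent=2))
```

Calling the function without a prefix yields a resource of the form arn:aws:s3:::your-bucket/*, which matches the "delete the cloudsmith-logs/ string" case.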

The IAM Role should have the following Role Trust Relationship Policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::884446598447:root"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "your-external-id"
        }
      }
    }
  ]
}

📘

Note: External ID

Make sure the value for sts:ExternalId matches what Cloudsmith provided you.
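One way to avoid leaving the placeholder in place is to generate the trust policy from the External ID you were given. The sketch below is only an illustration: build_trust_policy is a hypothetical helper name, and the External ID in the usage line is a made-up value.

```python
import json

# Cloudsmith's AWS account ID, as given in step 4 of this guide.
CLOUDSMITH_ACCOUNT_ID = "884446598447"


def build_trust_policy(external_id: str) -> dict:
    """Return a trust policy letting Cloudsmith assume the role with external_id."""
    # Guard against leaving the documentation placeholder in place.
    if not external_id or external_id == "your-external-id":
        raise ValueError("use the External ID that Cloudsmith provided")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{CLOUDSMITH_ACCOUNT_ID}:root"},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    # "example-external-id" is a made-up value for illustration only.
    print(json.dumps(build_trust_policy("example-external-id"), indent=2))
```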

5. Let us know once the bucket and role are created, and we'll complete the setup on our side. If you'd like to export logs for only some repositories rather than all of them, just let us know.

During the testing period, exports are locked to a daily frequency (once per day).

Summary of Information Required

You should have provided us with the following information:

  • ID of your AWS account (e.g. 995557609558, a 12-digit number).
  • Name of your S3 bucket (e.g. cloudsmith-acmecorp-logs).
  • Name of your S3 prefix (e.g. cloudsmith-logs), if any.
  • Name of your IAM Role that we'll use for exporting logs (e.g. CloudsmithLogsWriter).
  • The format of the logs you'd like to export (e.g. Apache-style, CSV, JSON Stream, JSON Stream + Timestamp, etc.)
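Before contacting us, it can help to collect these values in one place and sanity-check them. The following is a minimal sketch under assumptions: the ExportRequest class and its field names are hypothetical, and the only checks are that the account ID is 12 digits and that the required values are non-empty. The role ARN follows the standard AWS form arn:aws:iam::ACCOUNT_ID:role/ROLE_NAME.

```python
import re
from dataclasses import dataclass


@dataclass
class ExportRequest:
    """Hypothetical container for the details listed above."""

    account_id: str
    bucket: str
    role_name: str
    log_format: str
    prefix: str = ""  # optional

    def validate(self) -> None:
        # AWS account IDs are 12-digit numbers.
        if not re.fullmatch(r"\d{12}", self.account_id):
            raise ValueError("AWS account ID should be exactly 12 digits")
        for field_name in ("bucket", "role_name", "log_format"):
            if not getattr(self, field_name):
                raise ValueError(f"{field_name} is required")

    def role_arn(self) -> str:
        # Standard ARN layout for an IAM Role.
        return f"arn:aws:iam::{self.account_id}:role/{self.role_name}"


if __name__ == "__main__":
    request = ExportRequest(
        account_id="995557609558",
        bucket="cloudsmith-acmecorp-logs",
        role_name="CloudsmithLogsWriter",
        log_format="JSON Stream + Timestamp",
        prefix="cloudsmith-logs",
    )
    request.validate()
    print(request.role_arn())
```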

📘

Which format should I choose?

It depends! Each format is "streaming", meaning there's one line per download entry rather than one large object. In terms of detail, the Apache (or Nginx) format has the least, because it has to conform to those formats; CSV has more detail but is limited by its fixed columns; and JSON has the most detail, as it is a structured format.

We recommend choosing JSON Stream + Timestamp, in which a JSON blob representing the download is prefixed with a timestamp. This makes the logs easy to parse and import into your favorite Business Intelligence (BI) or Security Information and Event Management (SIEM) tooling.
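To give a feel for how such a line might be consumed, here is a small Python sketch. It rests on assumptions that this document does not confirm: that the timestamp is in ISO 8601 form, that it is followed by whitespace and then a single JSON object, and that the JSON keys resemble the CSV column names. The sample line is invented for illustration; check a real export before relying on any of this.

```python
import json
from datetime import datetime
from typing import Tuple


def parse_timestamped_line(line: str) -> Tuple[datetime, dict]:
    """Split a 'timestamp + JSON' line into a datetime and a dict.

    Assumes the JSON object starts at the first '{' on the line.
    """
    start = line.find("{")
    if start == -1:
        raise ValueError("no JSON object found in line")
    timestamp_text = line[:start].strip()
    record = json.loads(line[start:])
    return datetime.fromisoformat(timestamp_text), record


if __name__ == "__main__":
    # Invented sample line; real exports may use different keys and layout.
    sample = '2021-04-01T19:40:55+00:00 {"status": 200, "method": "GET"}'
    when, entry = parse_timestamped_line(sample)
    print(when.isoformat(), entry["method"], entry["status"])
```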

We should have provided you with the following information:

  • An ExternalId value used for assuming your IAM Role, in order to export the logs.

Caveats

  • During the testing period, logs will be generated once per day.
  • Due to some internal limitations (that we'll be working on):
    • Some logs will not have a relevant link to the package that was involved in the download.
    • Some logs involving Basic Auth will not have a link to the user/token involved in the download.

Data Provided

Some of the data provided includes:

  • Edge Location: Nearest CDN Edge Node Location (e.g., "LHR61-C2")
  • EULA ID: Unique identifier for EULA accepted (if any)
  • EULA Number: Revision of EULA accepted (if any)
  • IP Address: IP of the client
  • Host: Hostname for the request (e.g. your download domain)
  • Method: HTTP Method (e.g. "GET")
  • Geo/IP Fields: Location fields such as "City" and "Continent", enriched based on the client IP (we pay for this data)
  • Package Identifier: Unique identifier for downloaded package (if any)
  • Package Name: Name for the downloaded package (if any)
  • Package Version: Version for downloaded package (if any)
  • Protocol: HTTP Protocol Used (e.g. "https/1.1")
  • Referer: HTTP Referrer
  • Request ID: Globally unique identifier for the request
  • Status: HTTP Status Code (e.g. "200")
  • Time Taken: Time taken to service request, in seconds
  • Token ID: Unique identifier for entitlement token used (if any)
  • Token Name: Name for entitlement token used (if any)
  • URI: Uniform Resource Identifier, i.e. path, for the request
  • User Agent: HTTP User Agent sent by the client
  • User ID: Unique identifier for the client user (if any)
  • User Name: Name for the client user (if any)

Format Examples

CSV

Header line and one record:

datetime,repository,status,method,uri,host,ip_address,bytes,city,country,edge,eula,format,package,recorded,referer,request_id,token,user,user_agent

2021-04-01T19:40:55+00:00,cloudsmith/testing-private,200,GET,/rpm/fedora/29/x86_64/cloudsmith-redhat-example-1.0.16173057145-1.x86_64.rpm,dl.cloudsmith.io,82.1.4.8,10421,Newmarket,GB,LHR61-C2,,RedHat,vmN1OlMDi3Iy,2021-04-01T19:46:23.817658+00:00,,dZ3DmJ8yHU36FyH9uHoyWH4LocmlkzdKXSEuOBmnvLhA4ZF7exa5Qw==,wKSkBFHpsNWB,eP1B3YCtTlJX,libdnf
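As an illustration of how the CSV export could be read, the sketch below parses the header and record above with Python's standard csv module. The function name read_access_log is hypothetical, and the sketch assumes each exported file starts with the header line shown here.

```python
import csv
import io

# Header and record copied from the example above.
SAMPLE = (
    "datetime,repository,status,method,uri,host,ip_address,bytes,city,country,"
    "edge,eula,format,package,recorded,referer,request_id,token,user,user_agent\n"
    "2021-04-01T19:40:55+00:00,cloudsmith/testing-private,200,GET,"
    "/rpm/fedora/29/x86_64/cloudsmith-redhat-example-1.0.16173057145-1.x86_64.rpm,"
    "dl.cloudsmith.io,82.1.4.8,10421,Newmarket,GB,LHR61-C2,,RedHat,vmN1OlMDi3Iy,"
    "2021-04-01T19:46:23.817658+00:00,,"
    "dZ3DmJ8yHU36FyH9uHoyWH4LocmlkzdKXSEuOBmnvLhA4ZF7exa5Qw==,"
    "wKSkBFHpsNWB,eP1B3YCtTlJX,libdnf\n"
)


def read_access_log(text: str) -> list:
    """Parse CSV log text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


if __name__ == "__main__":
    for row in read_access_log(SAMPLE):
        # Every value arrives as a string; convert numeric fields as needed.
        print(row["repository"], row["status"], int(row["bytes"]), row["user_agent"])
```

Empty columns, such as eula and referer in this record, come through as empty strings.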

📘

It is possible to include or exclude the following:

  • EULA acceptance data.
  • Location-based (Geo) data.
  • IP data.
  • Namespace identifiers/paths.
  • Repository identifiers/paths.

We can provide examples of the other supported formats on request and will update this document with more examples.

