Set up a private NuGet server

Deploy the NuGet.Server project

The NuGet.Server package on nuget.org:
https://www.nuget.org/packages/NuGet.Server/2.11.3/

The project source:
https://github.com/NuGet/NuGet.Server

Deploy the project to IIS and configure the web.config:

1. Set the apiKey used for pushing packages to the server.
2. Set packagesPath to the folder that stores all the packages; the default is ~/Packages (you need to give write permission to the app pool user).
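
For reference, a minimal sketch of the relevant appSettings block in the NuGet.Server web.config (the values here are placeholders, not defaults):

<appSettings>
  <!-- clients must send this key when pushing packages -->
  <add key="apiKey" value="{your API key}" />
  <!-- physical folder where pushed packages are stored; leave empty to use ~/Packages -->
  <add key="packagesPath" value="D:\NuGetPackages" />
</appSettings>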

Pushing packages

1. Using the NuGet CLI

https://docs.microsoft.com/en-us/nuget/tools/nuget-exe-cli-reference

nuget.exe push -Source {NuGet package source URL} -ApiKey {your API key} {your_package}.nupkg

2. Using the .NET Core CLI

https://docs.microsoft.com/en-us/dotnet/articles/core/tools/dotnet-nuget-push


dotnet pack --configuration release
dotnet nuget push foo.nupkg -k 4003d786-cc37-4004-bfdf-c4f3e8ef9b3a -s http://customsource/

Enable authentication for accessing the NuGet server

1. Enable Windows Authentication for the server site in IIS.
2. Create a Windows user.
3. Add the repository source with username and password (using the NuGet CLI); it will be saved into the global nuget.config file. (NuGet normally keeps a global nuget.config in \Users\{user}\AppData\Roaming\NuGet.)

nuget.exe sources add -name {feed name} -source {feed URL} -username {username} -password {PAT} -StorePasswordInClearText

If you don’t provide the username and password, the server will return a 401 Unauthorized error. In Visual Studio, a dialog will prompt for credentials.

Restore NuGet packages using a nuget.config per solution

If you work on a new machine and check out the source code of a project, you will need to configure the NuGet source, username, password, etc. To let developers restore packages and build the project without any hassle, we can create a nuget.config per solution.


<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <packageSourceCredentials>
    <AWS_x0020_Nuget>
      <add key="Username" value="spnugetuser" />
      <add key="ClearTextPassword" value="SearchParty2017" />
    </AWS_x0020_Nuget>
  </packageSourceCredentials>
  <packageSources>
    <add key="AWS Nuget" value="http://nuget.searchparty.com/nuget" />
  </packageSources>
</configuration>

Then you can call

nuget restore

or

dotnet restore


Cassandra tips

Use short column names

Column names take up space in each cell, and if you use a big clustering key, its value is copied across all of your clustered cells.

We have eventually found that in some situations column names (including clustering keys) take up more space than the data we wanted to store! So it is good advice to use short column names and short clustering keys.
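
For example, with the pre-3.0 storage engine the clustering key name and value are stored alongside every clustered cell, so a verbose name multiplies the overhead (hypothetical tables):

-- Verbose: "measurement_timestamp" is repeated with every clustered cell
CREATE TABLE readings_verbose (sensor_id int, measurement_timestamp timestamp, measurement_value double,
    PRIMARY KEY (sensor_id, measurement_timestamp));

-- Same model with short names: much less per-cell overhead
CREATE TABLE readings (sensor_id int, ts timestamp, v double,
    PRIMARY KEY (sensor_id, ts));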

You can write data in the future

Using the CQL driver you can explicitly set the timestamp of each of your key/value pairs. One nice trick is to set this timestamp in the future: that makes the data immutable until the date is reached.
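
A minimal CQL sketch of the trick (the settings table is hypothetical; USING TIMESTAMP takes microseconds since the epoch):

-- Written "from the future": normal writes carry older timestamps, so they are ignored
INSERT INTO settings (id, value) VALUES (42, 'frozen')
    USING TIMESTAMP 1893456000000000;  -- 2030-01-01 UTC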

Don’t use TimeUUID with a specific date

TimeUUID is a very common type for Cassandra column names, in particular when using wide rows. If you create a TimeUUID for the current time, there is no problem: your data will be stored chronologically, and your keys will be unique. However, if you force the date, the underlying algorithm will not create a unique ID! Isn’t this surprising, for a “UUID” (Universally Unique Identifier) field?

As a result, only use TimeUUID if:

  • You use them at the current date
  • You force the date, but are OK with losing other data stored at the same date! (See the CQL sketch below.)
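
A CQL sketch of the difference (the events table is hypothetical; minTimeuuid/maxTimeuuid are the standard CQL functions for building TimeUUIDs from a fixed date):

-- Safe: now() generates a unique, chronologically ordered TimeUUID
INSERT INTO events (day, id, payload) VALUES ('2017-06-01', now(), 'hello');

-- minTimeuuid()/maxTimeuuid() build deterministic, NON-unique "fake" UUIDs:
-- use them for range queries only, never as stored keys
SELECT * FROM events
    WHERE day = '2017-06-01'
    AND id > minTimeuuid('2017-06-01 00:00+0000')
    AND id < maxTimeuuid('2017-06-02 00:00+0000');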

Don’t use PreparedStatement if you insert empty columns

If you have an empty column in your PreparedStatement, the CQL driver will in fact insert a null value in Cassandra, which will end up being a tombstone.

This is a very bad behavior, as:

  • Those tombstones of course take up valuable resources.
  • As a result, you can easily reach the tombstone_failure_threshold (by default at 100,000 which is in fact quite a high value).

The only solution is to have one PreparedStatement per type of insert query, which can be annoying if you have a lot of empty columns! But if you have multiple empty columns, shouldn’t you have used a Map to store that data in the first place?
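
A sketch of the problem with the DataStax Java driver (the table, columns, and an open session are assumptions for illustration):

// Binding null for a column you simply don't have writes a tombstone:
PreparedStatement full = session.prepare(
        "INSERT INTO users (id, name, email) VALUES (?, ?, ?)");
session.execute(full.bind(42, "alice", null));  // tombstone on email!

// One statement per insert shape avoids the null entirely:
PreparedStatement noEmail = session.prepare(
        "INSERT INTO users (id, name) VALUES (?, ?)");
session.execute(noEmail.bind(42, "alice"));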

Don’t use Cassandra as a queue

Using Cassandra as a queue looks like a good idea, as wide rows definitely look like queues. There are even several projects using Cassandra as a persistence layer for ActiveMQ, so this should be a good idea!

This is in fact the same problem as the previous point: when you delete data, Cassandra will create tombstones, and that will be bad for performance. Imagine you write and delete 10,000 rows, and then write 1 more row: in order to fetch that one row, Cassandra will in fact process the whole 10,001 rows…

Use the row cache wisely

By default Cassandra uses a key cache, but whole rows can also be cached. We find this rather under-used, and we have had excellent results when storing reference data (such as countries, user profiles, etc.) in memory; the snippet after the list below shows how to enable it.

However, be careful of two pitfalls:

  • The row cache in fact stores a whole partition in cache (it works at the partition key level, not at the clustering key level), so putting a wide row into the row cache is a very bad idea!
  • If you put the row cache off-heap, it will be outside the JVM, so Cassandra will need to deserialize it first, which will be a performance hit.
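
To turn the row cache on for a small reference table, the table option looks like this (Cassandra 2.1+ syntax; the table name is hypothetical):

ALTER TABLE countries
    WITH caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'};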

Don’t use “select … in” queries

If you do a “select … in” on 20 keys, you will hit one coordinator node that will need to get all the required data, which can be distributed all over your cluster: it might need to reach 20 different nodes, and then it will need to gather all that data, which will put quite a lot of pressure on this coordinator node.

As the latest CQL driver can be configured to be token aware, you can use this feature to do 20 token-aware, asynchronous queries. As each of those queries directly hits the correct node storing the requested data, this will probably be more performant than doing a “select … in”, as you save the round trip to the coordinator node.
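
A sketch with the DataStax Java driver (it assumes a TokenAwarePolicy configured on the Cluster, a prepared selectById statement, and an open session):

List<ResultSetFuture> futures = new ArrayList<>();
for (UUID id : ids) {
    // each async query is routed straight to a replica owning the key
    futures.add(session.executeAsync(selectById.bind(id)));
}
for (ResultSetFuture future : futures) {
    Row row = future.getUninterruptibly().one();
    // process the row...
}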

Configure the retry policy when several nodes fail

This of course depends on whether you prefer high consistency or high availability: as always, the good thing with Cassandra is that this is tunable!

If you want good consistency, you have probably configured your queries to use a quorum (or a local_quorum if you have multiple datacenters), but what happens if you lose 2 nodes, given the usual replication factor of 3? You didn’t lose any data, but as you lost the quorum for some data, you will start to get failed queries! A good compromise is to tune the retry policy and use the DowngradingConsistencyRetryPolicy: this allows you to lower your consistency level temporarily, giving you time to restore one of the failed nodes and get your quorum back.
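
Configuring it when building the cluster with the DataStax Java driver looks like this:

Cluster cluster = Cluster.builder()
        .addContactPoint("127.0.0.1")
        .withRetryPolicy(DowngradingConsistencyRetryPolicy.INSTANCE)
        .build();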

Don’t forget to repair

The repair operation is very important in Cassandra, as this is what guarantees that you won’t have forgotten deletes. For example, this can happen when you had a hardware failure and you bring the node back after some tombstones have expired on other nodes: Cassandra will see the deleted data as new data (as the tombstones have disappeared), and this data will be “resurrected” in your cluster.

Repairing nodes should be a regular and normal operation on your cluster, but as this has to be set up manually, we see many clusters where this is not done properly.
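
For example, repairing only the primary ranges of each node in turn is a common way to schedule it (e.g. weekly from cron):

nodetool repair -pr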

For your convenience, DataStax Enterprise, the commercial version of Cassandra, provides a “repair service” with OpsCenter, that does this job automatically.

Clean up your snapshots

Taking a snapshot is cheap with Cassandra and can often save you after a wrong operation. For instance, a database snapshot is automatically created when you do a truncate, and this has already been useful to us on a production system!

However, snapshots take space, and as your stored data grows you will need that space sooner or later: a good process is to save those snapshots outside of your cluster (for example, by uploading them to Amazon S3), and then clean them up to reclaim the disk space.
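
The relevant nodetool commands (after you have copied the snapshot files somewhere safe):

# see which snapshots exist and how much space they use (Cassandra 2.1+)
nodetool listsnapshots

# remove all snapshots, or a specific one with -t {tag}
nodetool clearsnapshot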


Vagrant

Vagrant provides easy to configure, reproducible, and portable work environments built on top of industry-standard technology and controlled by a single consistent workflow to help maximize the productivity and flexibility of you and your team.

Vagrant stands on the shoulders of giants. Machines are provisioned on top of VirtualBox, VMware, AWS, etc.

I am using VirtualBox as an example; you can fire up an Ubuntu box with a few lines of configuration.


# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/trusty64"

  config.vm.synced_folder "./data", "/home/vagrant/data"
  config.vm.provision "shell", path: "./scripts/vagrant/install-glance.sh"

  config.vm.network "forwarded_port", guest: 8983, host: 8984, auto_correct: true
  config.vm.network "private_network", ip: "192.168.33.10"

  config.vm.provider "virtualbox" do |vb|
    # Customize the amount of memory on the VM:
    vb.memory = "8024"
  end
end

You can forward a port from the virtual box to your host; if the host port is already used by another program, Vagrant can automatically fix the collision and assign a new port.


config.vm.network "forwarded_port", guest: 8983, host: 8984, auto_correct: true

You can set up a private network IP for the box.


config.vm.network "private_network", ip: "192.168.33.10"

Set up a synced folder that can be accessed both inside the box (over SSH) and from your host machine.


config.vm.synced_folder "./data", "/home/vagrant/data"

After you have created the Vagrantfile, you can call

vagrant up

to fire up the box.

Once the Vagrantfile has been changed, you need to call

vagrant reload

to refresh the virtual box.

To destroy the box, call

vagrant destroy

To browse publicly available Vagrant boxes:

https://atlas.hashicorp.com/boxes/search

http://www.vagrantbox.es/

SelectListItem helper to create SelectListItems from an enum

First, we create an extension method to get the descriptions of enum values.

public static string ToDescription<T>(this T enumValue)
    where T : struct, IConvertible, IComparable, IFormattable // criteria for enums
{
    var fieldInfo = enumValue.GetType().GetField(enumValue.ToString());
    var attributes = fieldInfo.GetCustomAttributes(typeof(DescriptionAttribute), false).Cast<DescriptionAttribute>().ToList();
    // fall back to the enum value's name when no [Description] attribute is present
    return attributes.Any() ? attributes.First().Description : enumValue.ToString();
}

Second, create the SelectListItems from the enum.

public static List<SelectListItem> GetItemsForEnum<TEnum>(int? selectedValue = null, string defaultText = "")
    where TEnum : struct, IConvertible, IComparable, IFormattable // criteria for enums
{
    var results = new List<SelectListItem>();
    var values = Enum.GetValues(typeof(TEnum));

    if (!string.IsNullOrEmpty(defaultText))
        results.Add(new SelectListItem { Text = defaultText, Value = "" });

    foreach (var value in values)
    {
        var name = ((TEnum)value).ToDescription();
        var selected = (selectedValue != null) && selectedValue.Equals((int)value);
        results.Add(new SelectListItem { Text = name, Value = ((int)value).ToString(), Selected = selected });
    }
    return results;
}
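
Usage looks like this (the Fruit enum is just a hypothetical example):

public enum Fruit
{
    [Description("Red apple")]
    Apple = 1,
    Banana = 2
}

// e.g. in a controller: build the dropdown items, pre-selecting Banana
var items = GetItemsForEnum<Fruit>(selectedValue: 2, defaultText: "-- pick a fruit --");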

Override EF 5 database mapping
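
A minimal example: the UserMapping class below overrides the default conventions (table name, identity key, column length) and is registered in OnModelCreating.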

public class UserRepo
{
    private UserContext _context;
    public UserRepo(UserContext context)
    {
        _context = context;
    }

    public User Save(User user)
    {
        if (user.Id <= 0)
        {
            _context.Users.Add(user);
        }
        else
        {
            // Attach alone only starts tracking the entity; mark it Modified
            // so the update is written when SaveChanges is called
            _context.Users.Attach(user);
            _context.Entry(user).State = EntityState.Modified;
        }
        return user;
    }
}
 
public class User
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class UserMapping : EntityTypeConfiguration<User>
{
    public UserMapping()
    {
        HasKey(p => p.Id);
        Property(p => p.Id).HasDatabaseGeneratedOption(DatabaseGeneratedOption.Identity).HasColumnName("Id");
        Property(p => p.Name).HasMaxLength(100);

        ToTable("User");
    }
}

public class UserContext : DbContext
{
    public DbSet<User> Users { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        modelBuilder.Configurations.Add(new UserMapping());
    }

    public UserRepo UserRepo
    {
        get
        {
            return new UserRepo(this);
        }
    }
}

SQL script to view sizes of all tables

SELECT
    t.NAME AS TableName,
    p.rows AS RowCounts,
    SUM(a.total_pages) * 8 AS TotalSpaceKB, 
    SUM(a.used_pages) * 8 AS UsedSpaceKB, 
    (SUM(a.total_pages) - SUM(a.used_pages)) * 8 AS UnusedSpaceKB
FROM
    sys.tables t
INNER JOIN     
    sys.indexes i ON t.OBJECT_ID = i.object_id
INNER JOIN
    sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN
    sys.allocation_units a ON p.partition_id = a.container_id
WHERE
    t.NAME NOT LIKE 'dt%'
    AND t.is_ms_shipped = 0
    AND i.OBJECT_ID > 255 
GROUP BY
    t.Name, p.Rows
ORDER BY
    t.Name